Evaluating Foehn Occurrence in a Changing Climate Based on Reanalysis and Climate Model Data Using Machine Learning

Christoph Mony, Lukas Jansing, and Michael Sprenger
Institute for Atmospheric and Climate Science, ETH Zürich, Zurich, Switzerland

Abstract

This study explores the possibilities of employing machine learning algorithms to predict foehn occurrence in Switzerland at a north Alpine (Altdorf) and south Alpine (Lugano) station from its synoptic fingerprint in reanalysis data and climate simulations. This allows for an investigation on a potential future shift in monthly foehn frequencies. First, inputs from various atmospheric fields from the European Centre for Medium-Range Weather Forecasts (ECMWF) interim reanalysis (ERAI) were used to train an XGBoost model. Here, similar predictive performance to previous work was achieved, showing that foehn can accurately be diagnosed from the coarse synoptic situation. In the next step, the algorithm was generalized to predict foehn based on the Community Earth System Model (CESM) ensemble simulations of a present-day and warming future climate. The best generalization between ERAI and CESM was obtained by including the present-day data in the training procedure and simultaneously optimizing two objective functions, namely, the negative log loss and squared mean loss, on both datasets, respectively. It is demonstrated that the same synoptic fingerprint can be identified in CESM climate simulation data. Finally, predictions for present-day and future simulations were verified and compared for statistical significance. Our model is shown to produce valid output for most months, revealing that south foehn in Altdorf is expected to become more common during spring, while north foehn in Lugano is expected to become more common during summer.

© 2021 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Christoph Mony, christoph.mony@usys.ethz.ch

1. Introduction

Ever since humans started to settle in the mountains, foehn winds have impacted their lives. The word “foehn” is often used as a generic term for a strong, warm, and dry downslope wind (Richner and Hächler 2013), while WMO (1992) formally defines foehn as “wind [which is] warmed and dried by descent, in general on the lee side of a mountain.” Thus, foehn is a worldwide phenomenon that occurs under different names in North America (Chinook; Oard 1993), New Zealand (Nor’wester; Mcgowan and Sturman 1996; Simpson et al. 2014), or South America (Puelche; Beusch et al. 2018). However, local peculiarities might exist depending on the region of interest. Our study area, Switzerland, is located in central Europe, and about two-thirds of its landscape is shaped by the European Alps. Here, the main Alpine crest separates the country into two areas, in both of which foehn occurs under particular synoptic conditions (Richner and Hächler 2013). Foehn on the northern side of the crest is called south foehn (e.g., in the Reuss or Rhine Valley), while foehn on the southern side is called north foehn (e.g., in the Leventina Valley). Both variants exhibit a pronounced seasonal cycle, with maximum foehn occurrence during the spring months. For a more in-depth discussion of both variants, their seasonal cycle, and synoptic causes, see Cetti et al. (2015) and Gerstgrasser (2017).

Due to its strong and warm characteristics, foehn is known to impact forest fires by drying out entire forest areas (Pezzatti et al. 2016) or driving the fire spread itself (Sharples et al. 2010). Many studies have stressed the impact of foehn on fires in Switzerland (Zumbrunnen et al. 2009; Richner and Hächler 2013; Pezzatti et al. 2016; Sprenger et al. 2016). In addition, foehn has been shown to have implications for the aviation industry, snowmelt in spring, agriculture, and air pollution (Steinacker 2006; Richner and Hächler 2013; Mayr et al. 2018). Thus, its potentially harmful nature demands an accurate forecast for early risk mitigation.

Foehn is traditionally forecast from the output of numerical weather prediction (NWP) models combined with the expertise of experienced meteorologists who are familiar with local conditions (Zweifel et al. 2016). However, due to the insufficient representation of the topography in NWP models, direct model output has been shown not to represent foehn flows accurately. Burri et al. (2007) showed in a case study that the Consortium for Small-Scale Modeling (COSMO) model overestimates the area affected by foehn and underestimates its wind speeds and temperature. Wilhelm et al. (2012) concluded in their more systematic review of the COSMO model that the foehn frequency is estimated best for Alpine stations but overestimated for Swiss plateau stations. The wind speed and temperature of foehn events were found to be forecast unsatisfactorily close to the Alpine ridge, in agreement with the findings of Hächler et al. (2011). Consequently, because NWP models do not accurately capture the finescale details of foehn flows, forecasting still depends heavily on the skill of experienced forecasters (Richner and Hächler 2013). However, forecaster resources are scarce, calling for more scalable approaches (Mayr et al. 2018).

For this reason, in the past decades, model output statistics (MOS) gained more popularity (Glahn and Lowry 1972). Instead of exclusively looking at the raw output of NWP models, one applies statistical techniques to recognize the synoptic causes for foehn in the data. Widmer (1966) was the first one to predict foehn in Altdorf, Switzerland. To this end, he used Fisher’s discriminant analysis to establish an index based upon the forecasted surface level pressure difference and geopotential height difference over the Alps. If this index surpasses a certain threshold, foehn will set in during the next 12–36 h with a probability of 70%. Later, Courvoisier and Gutermann (1971) were able to refine this index. Until today, the Widmer index is used operationally as one of the measures to forecast foehn at the Swiss national weather service (MeteoSwiss). Later, Drechsel and Mayr (2008) analyzed ECMWF model data with MOS. They focused on the cross-Alpine pressure and potential temperature difference to forecast foehn in the Wipp Valley, Austria, and found that a reliable forecast can be established up to three days in advance. Last, Zweifel et al. (2016) successfully used a machine learning (ML) technique (i.e., logistic regression with L1-regularization) on COSMO and ECMWF models to forecast foehn in Altdorf and Piotta—two Swiss stations on the Alpine north and south side, respectively—15 and 39 h in advance. Moreover, in other parts of the world, Oard (1993) used a multiple linear regression model to forecast foehn in the Montana Rockies, while Mercer et al. (2008) applied several ML methods (viz., linear regression, neural networks, and support vector machines) in the Boulder region. However, a natural limitation of all these techniques comes down to the existence of objectively diagnosed and labeled foehn data. For this reason, we summarize different foehn diagnosis techniques, which nowcast foehn, in the next paragraph.

In the past, there was a lack of an objective foehn definition, and often subjective criteria for temperature, humidity, and wind speed were applied (Richner et al. 2014; Zweifel et al. 2016). Gutermann (1970) was the first to develop an objective and automated classification by utilizing Fisher’s discriminant analysis based on wind, temperature, and humidity anomalies at the valley station. Since this method did not rely upon physical foehn processes on the lee side, Vergeiner (2004) improved upon this approach in the Wipp Valley by including criteria for the potential temperature difference between a crest and the valley station and for the wind direction at both stations. Another approach was developed by Duerr (2008), which allowed using real-time data from the Swiss automatic weather stations to classify south foehn at various locations all over Switzerland. This method relied on objective thresholds derived from a statistical analysis of 10 years of data. Since 2008, MeteoSwiss has used Duerr’s method operationally to automatically classify south foehn for every 10-min interval (Richner and Hächler 2013). In addition, Cetti et al. (2015) utilized the same method to derive an objective index for north foehn in the Swiss canton of Ticino. However, as Plavcan et al. (2014) noted, the main drawback of these methods is the necessary manual determination of variable thresholds for each location. Thus, Plavcan et al. (2014) were the first to employ a probabilistic method. By utilizing a Gaussian mixture model, they managed to not only derive a foehn index but also assign an uncertainty to each measurement. This method is thought to be location independent, and thus their simplest model is applicable even in the absence of a crest station at the cost of a few wrongly classified observations. Later, Plavcan and Mayr (2015) showed that their method also generalizes well to other locations in the Alpine region. Last, Sprenger et al. (2017) also used a probabilistic approach by leveraging an AdaBoost algorithm on common foehn predictors in COSMO-7 data to recognize patterns that cause foehn in Altdorf. With their method, they were able to predict foehn correctly approximately three out of four times (depending on the threshold) and showed that it is possible to derive foehn from mesoscale conditions in NWP data. Reaching beyond the Alpine region, foehn detection has also been used in Antarctica, where Laffin et al. (2019) used an XGBoost model to classify different foehn strengths related to the ice melt.

Although accurate diagnosis and forecasts have been achieved at the time scale of numerical weather prediction, the long-term evolution of foehn occurrence is still open to debate. Gutermann et al. (2012) showed that no long-term trend is discernible in the foehn time series to date. Nevertheless, even though the foehn frequency does not show a trend over the years, it has a pronounced intra-annual variability. In this regard, trends have been observed for certain months (Gutermann et al. 2012; Cetti et al. 2015). While this reflects the current situation, it remains unclear whether foehn occurrence will stay the same in a warming climate. The long-term evolution of foehn winds has also been the subject of research in other regions of the world. For example, Miller and Schlegel (2006) assessed the climatology of Santa Ana winds, finding a shift of the yearly Santa Ana season, while Guzman-Morales and Gershunov (2019) found a decrease in Santa Ana wind frequency for all months. However, no scientific knowledge on the future evolution of foehn flows in the Alps has been established so far. With this work, we therefore attempted to close this gap for the development of the monthly foehn frequency in Switzerland under a warming climate. Moreover, we propose a methodological approach that could, in principle, be applied worldwide.

To this end, we utilized an XGBoost model to identify foehn from its large-scale fingerprint in reanalysis data (i.e., ERAI), for which observational foehn data exist. Afterward, we generalized the model to a freely running ensemble climate simulation (i.e., CESM) using a constrained optimization procedure. Finally, the resulting predictions allowed us to compare foehn occurrence in simulations of present-day and future climate conditions, and hence to assess potential frequency shifts. The main objectives of this study were thus formulated in the following way:

  1. With which level of skill can Alpine foehn objectively be diagnosed from coarse NWP reanalysis data? What are the most relevant features on the meso- and synoptic scale?

  2. Is it possible to generalize the findings from reanalysis data to a freely running climate simulation and identify the same synoptic conditions?

  3. If so, how will the monthly frequency of Alpine foehn differ between present-day climate and a warming future climate?

The remainder of the paper is structured in the following way. In section 2, the data utilized in this study is presented. In section 3, we introduce concepts from statistical learning theory and how those were applied to our objectives. In section 4, the results are presented and discussed in the context of previous work. Finally, in section 5, we summarize our main findings, discuss potential limitations, and give an outlook to future work.

2. Data

a. Foehn observations

To train and evaluate our algorithms, we required observational foehn data as the target variable $y$. Since we formulated the analysis as a binary classification problem, each observation $i$ in ERAI (i.e., each state of the atmosphere at 0000, 0600, 1200, and 1800 UTC; see section 2b) was associated with its target variable:
$$y_i = \begin{cases} 1 & \text{if foehn was prevalent} \\ 0 & \text{else.} \end{cases}$$

As discussed in the introduction, foehn needs to be diagnosed by objective criteria. Hence, for our study, we relied on the methodology established by Duerr (2008) for south foehn and adapted and applied by Cetti et al. (2015) for north foehn. Their work resulted in a readily available foehn index with a temporal resolution of 10 min. Although the Duerr index faces limitations in the case of rare dimmerfoehn (Richner and Duerr 2015) or thunderstorm outflow events, Sprenger et al. (2017) argued that these deficiencies (in the form of misclassified samples) are overall negligible and the Duerr index captures the seasonal cycle as expected.

Duerr (2008) distinguished between foehn, mixed-foehn, and non-foehn cases. For mixed-foehn cases, the synoptic-scale situation favors foehn presence; however, local-scale conditions (e.g., prevailing cold pools) prevent the foehn from completely breaking through (Duerr 2008). Since our interest was in the synoptic situation, mixed-foehn events were treated as foehn.

Finally, we upscaled the temporal resolution to 6 h (specifically to 0000, 0600, 1200, and 1800 UTC). Gutermann et al. (2012) showed that the best transition from 10-min intervals to hourly observations in the morning, at noon, and in the evening is achieved if at least four out of six intervals indicate foehn. Moreover, they argued that this four-out-of-six rule is invariant under a 30-min shift to the past or future. Since we wanted to capture the synoptic situation amid an ongoing foehn event wherever possible, we defined foehn as prevalent if at least 40 out of 60 min, centered symmetrically around 0000, 0600, 1200, and 1800 UTC, showed the occurrence of a foehn wind. The results for each location are shown in Table 1. The mean foehn duration over all foehn events based on the 10-min-resolution data was around 7.5 h. Therefore, the upscaling of the Duerr index to 6-hourly resolution allows us to capture most foehn events, which is further underlined by the similar foehn occurrence in both datasets (cf. Table 1).
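The four-out-of-six rule can be illustrated with a minimal sketch. The function name, the dictionary layout of the 10-min index, the convention of keying each interval by its start time, and the treatment of missing intervals as "no foehn" are all assumptions made for illustration; they are not taken from the Duerr index itself.

```python
from datetime import datetime, timedelta

# Offsets (in minutes) of the six 10-min intervals in the 60-min window
# centered on the synoptic time. Keying each interval by its start time
# is an assumption made for this sketch.
WINDOW_OFFSETS = (-30, -20, -10, 0, 10, 20)


def foehn_at_synoptic_time(index_10min, synoptic_time, min_hits=4):
    """Return 1 if at least `min_hits` of the six intervals indicate foehn, else 0.

    `index_10min` maps interval start times to 0/1 foehn flags; missing
    intervals are counted as "no foehn" (an assumption).
    """
    hits = sum(
        index_10min.get(synoptic_time + timedelta(minutes=offset), 0)
        for offset in WINDOW_OFFSETS
    )
    return 1 if hits >= min_hits else 0


# Hypothetical example: foehn during four of the six intervals around 1800 UTC.
t = datetime(1989, 5, 3, 18, 0)
flags = (0, 0, 1, 1, 1, 1)
index = {t + timedelta(minutes=o): f for o, f in zip(WINDOW_OFFSETS, flags)}
label = foehn_at_synoptic_time(index, t)
```

With only three foehn intervals in the window, the same call would return 0, so short-lived events that do not cover at least 40 min around a synoptic time are not labeled as foehn.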

Table 1. Summary of selected foehn characteristics for both measurement stations. While the mean foehn hours per year and mean duration columns are based on the 10-min-resolution Duerr index, the total samples and positive samples columns are based on the upscaled 6-hourly data.

b. ERAI data

We based our investigation on the ERAI reanalysis (Dee et al. 2011) from the period January 1981 until August 2019. ERAI encompasses various atmospheric fields at a spectral resolution of T255, which were interpolated on a regular latitude–longitude grid with a spatial horizontal grid resolution of 1° (in our area of interest, approximately 80 km and 110 km in longitudinal and latitudinal direction, respectively) on 60 vertical levels and a temporal resolution of 6 h (i.e., at 0000, 0600, 1200, and 1800 UTC).

Since, for our analysis, the synoptic conditions over central Europe were relevant, the area of interest was bounded to an extended Alpine region (42°–50°N, 0°–15°E). This choice was also motivated by the relevant variables in the Widmer test (Widmer 1966; Courvoisier and Gutermann 1971). Due to the later application of our algorithms on CESM data, the atmospheric fields were interpolated from the ERAI grid to the slightly coarser CESM grid (see Fig. 1). In total, this left us with 104 horizontal grid points. Next, we considered atmospheric fields at sea level, 900, 850, 700, and 500 hPa. A topography plot of ERAI is shown in Fig. 2a. It should be noted that the elevation of the Alps is considerably reduced compared to reality. However, the synoptic-scale conditions which cause foehn (i.e., low pressure systems) are unlikely to be affected by this.

Fig. 1. Grid points of the CESM grid (blue). While the ERAI grid is defined on full degrees (intersection of parallels and meridians), CESM is slightly coarser and shifted. The locations of Altdorf (ALT) and Lugano (LUG) are marked in red. The olive points indicate the grid points in the Alps, which we removed to achieve better generalization to CESM (see section 3 for details).

Citation: Weather and Forecasting 36, 6; 10.1175/WAF-D-21-0036.1

Fig. 2. Topography of (a) ERAI and (b) CESM in terms of averaged surface pressure. The locations of the south foehn station Altdorf (ALT) and the north foehn station Lugano (LUG) are shown in white. The black dotted frame indicates the area within which we removed the features for the CESM prediction (see section 3 for details).

Due to the vast number of atmospheric fields in ERAI, we relied on a physically motivated approach and only selected features relevant for foehn. Richner and Hächler (2013), Zweifel et al. (2016), Gerstgrasser (2017), and Sprenger et al. (2017) defined sea level pressure differences ΔSLP, geopotential height differences ΔZ, potential temperature differences Δθ, zonal wind flow U, and meridional wind flow V to be important. Most of the final features were calculated by determining the unique differences between all 104 horizontal grid points on selected pressure levels. For the potential temperature, we separated between horizontal differences (representing a horizontal temperature gradient) and the vertical differences (representing the stability of the atmosphere). For the wind speed, we took the raw value at each grid point and pressure level. Finally, we obtained the features described in Table 2. Each row in the final dataset represented one observation (i.e., state of the atmosphere) at a given date–time (e.g., 1800 UTC 3 May 1989). Each column in the final dataset represented one feature at a specific grid point (or difference) and pressure level (e.g., V at 48.53°N and 10°E on 700 hPa).
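The construction of the difference features can be sketched as follows. The helper name and the sign convention (first point minus second point) are assumptions made for illustration; the placeholder field values carry no physical meaning.

```python
from itertools import combinations

N_GRID_POINTS = 104  # number of horizontal grid points stated in the text


def pairwise_differences(field):
    """Return field[i] - field[j] for every unique pair i < j.

    The sign convention is an assumption; the text only states that the
    unique differences between all grid points are used.
    """
    return [field[i] - field[j] for i, j in combinations(range(len(field)), 2)]


# Placeholder values standing in for one field (e.g., SLP) on one level.
field = [float(k) for k in range(N_GRID_POINTS)]
n_pairs = len(pairwise_differences(field))  # 104 * 103 / 2 unique pairs
```

Because each unordered pair of grid points is counted once, 104 grid points yield 104 × 103 / 2 = 5356 difference features per field and pressure level.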

Table 2. Summary of the final set of features used for our analysis. For ΔSLP, ΔZ, and Δθhor, we calculated the unique horizontal differences between all 104 grid points on a given pressure level, yielding (104 × 103)/2 = 5356 unique features per pressure level. For ΔZ we considered two pressure levels and consequently obtained twice as many features. For Δθver, we determined the vertical differences between two pressure levels on all 104 grid points, giving us 104 features per pressure level difference. In total, this yielded 208 features due to the two pressure level differences considered. For U and V, we took the raw value on each of the 104 grid points per pressure level, resulting in 208 features due to the two pressure levels.

c. CESM data

In contrast to the ERAI reanalysis, CESM (Hurrell et al. 2013) is a freely running climate simulation with a shifted and slightly more coarse horizontal grid (around 95 and 105 km in longitudinal and latitudinal direction, see Fig. 1) on 30 vertical levels and available at a 6-hourly temporal resolution. During the CESM large ensemble project (Kay et al. 2015), CESM was employed in the form of an ensemble simulation in which each of the 35 members was initialized on 1 January 1920 and then integrated forward until 2100. The model simulations used in this study relied on RCP8.5 external forcing, as discussed in Van Vuuren et al. (2011). To obtain different weather realizations while still retaining the same climate, each ensemble member was initialized with slightly different initial conditions.

A crucial point to consider when dealing with climate models is internal climate variability, which in turn has important implications for climate change projections, especially at regional or subdecadal scales (Kay et al. 2015). To control for the effects of internal climate variability, climate model ensembles are preferable to single climate simulations. However, the realism of the simulations can be degraded by biases inherent to the underlying models. This raises the question of why we decided to base our analysis on the CESM ensemble. First, the variables in CESM are implemented on a three-dimensional grid with a horizontal spacing similar to that of ERAI. Furthermore, compared to the mostly daily resolution of other climate models, the 6-hourly temporal resolution of CESM (the same as ERAI) also allows more short-lived foehn events to be captured. Hence, with CESM we found a model that closely matches ERAI spatially, temporally, and topographically (also compare Fig. 2). Second, generally good agreement has been found between ERAI and CESM for reproducing extratropical cyclone frequency, although CESM has a tendency to overestimate weak cyclones and the Mediterranean region has to be thoroughly evaluated (Raible et al. 2018). Moreover, CESM simulates atmospheric blocking situations over Europe reasonably well, although the model tends to slightly underestimate the number of blocked days compared to ERAI (Schaller et al. 2018). Also, Huguenin et al. (2020) have shown that the seasonal frequencies of atmospheric circulation types over central Europe indicate a good match between CESM and ERAI. Consequently, although some caveats exist, we consider the CESM ensemble framework suitable for our analysis.

In our case, two time periods were available for analysis due to the efforts of Röthlisberger et al. (2020). First, we considered the years 1991–2000 as a representation of present-day climate (CESM-p). Second, we considered the ensemble simulations of the years 2091–2100 as potential realizations of future climate (CESM-f).

The CESM data for both periods were processed similarly to the ERAI data. Apart from the different grid, the CESM data were not directly available at pressure levels. Thus, we used the pressure values at the 30 model levels to linearly interpolate Z, θ, U, and V to the required pressure levels. In case interpolation was not possible (i.e., if the pressure level was below the lowest model level), we decided against extrapolating this field for the respective pressure level and grid point. In such cases, the pressure level likely intersected with the topography. A topography plot for CESM can be found in Fig. 2b. Note how the slight differences in horizontal resolution lead to a smoother representation of the Alps compared to ERAI. Ultimately, the final features from Table 2 were calculated in the same way for CESM as for ERAI.
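The vertical interpolation without extrapolation can be sketched for a single grid column. The function name and the example values are hypothetical, and interpolating linearly in pressure (rather than, for example, in the logarithm of pressure) is an assumption, since the text only states that the interpolation is linear.

```python
def interpolate_to_pressure(level_pressures, level_values, target_pressure):
    """Linearly interpolate a column of model-level values to one pressure level.

    Returns None when `target_pressure` lies outside the range spanned by the
    model levels (e.g., below the lowest model level), i.e., no extrapolation.
    """
    pairs = sorted(zip(level_pressures, level_values))
    for (p0, v0), (p1, v1) in zip(pairs, pairs[1:]):
        if p0 <= target_pressure <= p1:
            if p1 == p0:
                return v0
            weight = (target_pressure - p0) / (p1 - p0)
            return v0 + weight * (v1 - v0)
    return None


# Hypothetical column: pressures in hPa, geopotential height in m.
pressures = [500.0, 700.0, 850.0]
heights = [5500.0, 3000.0, 1500.0]
z_600 = interpolate_to_pressure(pressures, heights, 600.0)  # between two levels
z_900 = interpolate_to_pressure(pressures, heights, 900.0)  # below lowest level
```

Returning `None` instead of an extrapolated value mirrors the decision to leave a field undefined where the pressure level likely intersects the model topography.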

3. Methodology

Since we formulated this study as a supervised binary classification problem, the necessary terminology is introduced in section 3a. Ultimately, gradient boosted tree models (XGBoost) were harnessed for prediction and are thus introduced in section 3b. In section 3c, we employ these models and describe how foehn can be diagnosed from ERAI data. Finally, the modifications undertaken to generalize the models to CESM data are explained in section 3d.

a. General terminology

For our analysis, we relied on statistical learning theory to identify foehn from reanalysis and climate model data. For supervised binary classification problems, a range of different machine learning (ML) models has evolved to distinguish between positive (e.g., foehn) and negative (e.g., no foehn) observations (see, e.g., Hastie et al. 2009). All such models follow the formalization of a binary classification problem. First, $N$ labeled observations or samples $(x, y)$ are assembled in a dataset $D = \{(x_i, y_i)\}_{i=1}^{N}$. Each $x_i = (x_{i1}, \ldots, x_{iM}) \in \mathbb{R}^M$ combines $M$ different feature realizations and represents a point in feature space $X = (X_1, \ldots, X_M)$, where the distribution of the features $X_m$ is generally unknown. The label $y_i \in \{0, 1\}$ determines whether a sample belongs to the positive or negative class and allows one to derive a decision boundary in $X$.

To this end, a binary classification algorithm optimizes a function $p = f(x; \theta): \mathbb{R}^M \to [0, 1]$ by minimizing the so-called loss function $l(y, p)$. Here, $p$ is the predicted probability for a sample and $\theta$ represents the parameters that are optimized during the training or fitting procedure. Formally, we write $\theta^* = \arg\min_{\theta} \sum_{i=1}^{N} l(y_i, p_i)$. Usually, $l$ resembles a distance between $y$ and $p$, and for binary classification the (negative) log loss
$$l_{\mathrm{LL}}(y_i, p_i) = -\left[y_i \log(p_i) + (1 - y_i) \log(1 - p_i)\right]$$
is a common choice. The final predicted label $\hat{y}$ can be determined from $p$ by applying a shifted Heaviside function $\hat{y} = \Theta_c(p) = \Theta(p - c): [0, 1] \to \{0, 1\}$. Here, the threshold $c \in [0, 1]$ is optimized for a certain metric of interest (see section 3c).
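As a minimal sketch, the log loss and the thresholding step can be written as below. The function names, the clipping constant `eps`, and the convention that a probability exactly equal to the threshold counts as positive are assumptions made for illustration.

```python
import math


def log_loss(y, p, eps=1e-12):
    """Negative log loss for one sample with label y in {0, 1} and probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def total_log_loss(labels, probabilities):
    """Sum of the per-sample losses, i.e., the quantity minimized during training."""
    return sum(log_loss(y, p) for y, p in zip(labels, probabilities))


def predict_label(p, c=0.5):
    """Shifted Heaviside function: 1 if p >= c, else 0 (tie convention assumed)."""
    return 1 if p >= c else 0


# A confident correct prediction is penalized far less than a confident wrong one.
loss_good = log_loss(1, 0.9)
loss_bad = log_loss(1, 0.1)
```

Raising the threshold `c` makes positive predictions rarer, which is the lever later used to trade false positives against false negatives.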

However, a sufficiently complex algorithm (e.g., large $|\theta|$) would be able to fit all observations in $D$, which results in so-called overfitting. In this case, the algorithm achieves a perfect score on all samples it has been trained on but faces severe problems when classifying new samples. For this reason, $D$ is usually split into the disjoint training set $D_{\text{train}}$ and test set $D_{\text{test}}$ (Mehta et al. 2019). The algorithm is then fitted on $D_{\text{train}}$, while its performance is evaluated on the previously unseen $D_{\text{test}}$, allowing judgments about the generalization power of the algorithm.

Next, we build upon this general description and discuss a particularly successful family of classification algorithms.

b. Gradient boosted trees and XGBoost

A powerful and widely used method to combine decision trees (a collection of ordered if-else splits partitioning X into regions, which are then associated with a scalar prediction value; as introduced in Breiman et al. 1984) into an ensemble is boosting. In boosting, each tree is trained to reduce the errors of the previous ensemble members. This method was first implemented by Friedman (2001) in gradient boosted trees. We refer the reader to the appendix for a more formal treatment of gradient boosted trees. A particularly useful characteristic of boosted decision trees is the so-called feature importance, which allows one to aggregate the split information from each ensemble member and hence compare different features against each other.
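The boosting principle can be illustrated with a deliberately simplified sketch: one-split trees (stumps) on a single feature, each fitted to the negative gradient of the log loss of the current ensemble. This is not the XGBoost algorithm (it omits second-order information, regularization, and multiple features), and all names, the learning rate, and the toy data are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Least-squares one-split tree: returns (threshold, left_value, right_value)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = 0.5 * (lo + hi)
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]


def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Fit stumps sequentially; needs both classes and >= 2 distinct x values."""
    base_rate = sum(ys) / len(ys)
    f0 = math.log(base_rate / (1.0 - base_rate))  # initial score: log-odds
    scores = [f0] * len(xs)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of the log loss with respect to the score.
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        t, lv, rv = fit_stump(xs, residuals)
        stump = (t, learning_rate * lv, learning_rate * rv)
        stumps.append(stump)
        scores = [s + (stump[1] if x <= t else stump[2]) for x, s in zip(xs, scores)]
    return f0, stumps


def predict_proba(x, model):
    f0, stumps = model
    return sigmoid(f0 + sum(lv if x <= t else rv for t, lv, rv in stumps))


# Toy data: the positive class occurs for large values of the single feature.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 0, 1, 1, 1]
model = boost(xs, ys)
```

Each added stump moves the predicted probabilities further toward the labels, which is the sense in which every new ensemble member reduces the errors of its predecessors.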

The computational performance of gradient boosted trees was highly optimized by Chen and Guestrin (2016) in their more sophisticated open-source implementation named XGBoost. The XGBoost model has been successfully used in many past applications achieving high performance on structured datasets, and is frequently among the highest-ranked models in Kaggle competitions (Chen and Guestrin 2016; Mehta et al. 2019). Also, XGBoost has already been applied in the foehn community to classify foehn strength in Antarctica (Laffin et al. 2019). XGBoost performs better compared to simple gradient boosted trees libraries due to several algorithmic and computational optimizations (for details see Chen and Guestrin 2016).

Finally, we summarize the advantages and disadvantages of the XGBoost algorithm. XGBoost inherits many beneficial properties from the underlying decision trees and thus does not require feature normalization/scaling and is robust toward outliers in the features (Hastie et al. 2009). Moreover, by design, decision trees automatically select the features with the most predictive power and ignore highly correlated features afterward since those do not contain additional information (Hastie et al. 2009; Louppe 2014). Also, XGBoost improves upon some shortcomings of decision trees, mainly improving the predictive performance. However, there are also some disadvantages. First, the greatest drawback compared to a normal decision tree is the loss of interpretability since a boosted decision tree is no longer a tree and inspection of single trees does not provide valuable meaning anymore. However, this problem can partially be circumvented by the feature importance mentioned above. Second, due to many additional hyperparameters that require tuning through cross validation, XGBoost might be more difficult to handle by inexperienced users than other methods.

In the next section, XGBoost is employed for the prediction of foehn winds on ERAI data.1

c. Foehn predictability on ERAI

First, we split the ERAI data $D_{\text{ERA}}$ into a training set $D_{\text{train}}$ and a test set $D_{\text{test}}$ for both Altdorf and Lugano. We decided to use the years 1991–2000 for $D_{\text{test}}$ since CESM-p simulates this period. Consequently, the remainder of the data, originating from the years 1981–90 and 2001–19, was used for $D_{\text{train}}$. Second, the hyperparameter optimization was conducted using threefold cross validation on $D_{\text{train}}$ (see, e.g., Hastie et al. 2009). Third, in order to evaluate model predictions on $D_{\text{test}}$, the confusion or contingency matrix (see Table 3) and derived metrics were used (Hoens and Chawla 2013).
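The year-based split and the threefold cross validation can be sketched as follows. The sample layout (dictionaries with a `year` key), the function names, and the use of contiguous folds are assumptions made for illustration; the text does not specify how the folds were formed.

```python
TEST_YEARS = range(1991, 2001)  # 1991-2000, the period simulated by CESM-p


def split_by_year(samples, test_years=TEST_YEARS):
    """Split samples into (train, test) according to their year."""
    train = [s for s in samples if s["year"] not in test_years]
    test = [s for s in samples if s["year"] in test_years]
    return train, test


def k_folds(items, k=3):
    """Yield (fit, validation) pairs for k contiguous folds (an assumed scheme)."""
    n = len(items)
    bounds = [round(i * n / k) for i in range(k + 1)]
    for i in range(k):
        validation = items[bounds[i]:bounds[i + 1]]
        fit = items[:bounds[i]] + items[bounds[i + 1]:]
        yield fit, validation


# Hypothetical samples: one placeholder record per year, 1981-2019.
samples = [{"year": year} for year in range(1981, 2020)]
train, test = split_by_year(samples)
folds = list(k_folds(train, k=3))
```

Hyperparameters would be chosen by averaging a score over the three validation folds, while the 1991–2000 test years stay untouched until the final evaluation.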

Table 3.

Shown is the XGBoost model performance on Dtest for Altdorf and Lugano in form of the confusion matrices (top and middle) and derived metrics (bottom).

A confusion matrix summarizes true positive (TP), false negative (FN), false positive (FP), and true negative (TN) predictions, where each of these values depends on the threshold c through the shifted Heaviside function Θc(p). Since we faced an imbalanced dataset (cf. Table 1), for model evaluation we relied on
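$$\text{correct alarm ratio (CAR) or precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}},$$
which measures how many of all positively predicted cases were correct, and
$$\text{probability of detection (POD) or recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}},$$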
which measures how many of all positive observations were correctly predicted (Hogan and Mason 2012). The positive class was defined as the rare class, and we evaluated the model along CAR/POD after balancing FP and FN predictions by adjusting the threshold c. In this way, we ensured that FP and FN errors were weighted equally.
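The threshold balancing can be illustrated with a short sketch. It assumes that a sample is classified as foehn when its predicted probability is at least c, and it scans all observed probabilities as candidate thresholds; both choices, the function names, and the synthetic data are assumptions made for illustration and are not taken from the original implementation.

```python
import numpy as np

def confusion_counts(y_true, p, c):
    """Return (TP, FP, FN, TN) when samples with p >= c are classified as foehn."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(p) >= c
    tp = int(np.sum(y_pred & y_true))
    fp = int(np.sum(y_pred & ~y_true))
    fn = int(np.sum(~y_pred & y_true))
    tn = int(np.sum(~y_pred & ~y_true))
    return tp, fp, fn, tn

def car_pod(tp, fp, fn):
    """Correct alarm ratio (precision) and probability of detection (recall)."""
    car = tp / (tp + fp) if tp + fp else float("nan")
    pod = tp / (tp + fn) if tp + fn else float("nan")
    return car, pod

def balanced_threshold(y_true, p):
    """Pick the candidate threshold for which |FP - FN| is smallest."""
    candidates = np.unique(p)
    gaps = []
    for c in candidates:
        _, fp, fn, _ = confusion_counts(y_true, p, c)
        gaps.append(abs(fp - fn))
    return float(candidates[int(np.argmin(gaps))])

# Synthetic, imbalanced toy data (about 5% positives).
rng = np.random.default_rng(0)
y = rng.random(2000) < 0.05
p = 1.0 / (1.0 + np.exp(-(rng.normal(-2.0, 1.0, 2000) + 3.0 * y)))

c = balanced_threshold(y, p)
tp, fp, fn, tn = confusion_counts(y, p, c)
car, pod = car_pod(tp, fp, fn)
print(f"c={c:.3f} TP={tp} FP={fp} FN={fn} TN={tn} CAR={car:.3f} POD={pod:.3f}")
```

Because CAR and POD share the numerator TP, FP = FN implies CAR = POD, which is why a single number can be reported for both after balancing.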

Additionally, in order to compare our model to other scientists’ work, we also calculated the following metrics: accuracy = (TP + TN)/(TP + FP + FN + TN), which is not a suitable metric for imbalanced datasets since it puts too much emphasis on the majority class (Murphy 1996; Hoens and Chawla 2013); the F1 score, which is the harmonic mean of CAR and POD; and the area under the receiver operating characteristic curve (AUROC), which aggregates the confusion matrix over all thresholds (Hogan and Mason 2012; Hoens and Chawla 2013). However, Davis and Goadrich (2006) have argued that the AUROC also presents an overly optimistic view of the model’s performance under a large class imbalance and that the area under the precision-recall curve (AUPRC) might instead give a more informative view of an algorithm’s performance.
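A small worked example shows why accuracy is misleading for a rare event. The counts below are hypothetical (1000 samples, 20 of them foehn) and are chosen only to make the arithmetic transparent.

```python
def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

def f1_score(tp, fp, fn):
    """Harmonic mean of CAR (precision) and POD (recall); 0 if there are no true positives."""
    if tp == 0:
        return 0.0
    car = tp / (tp + fp)
    pod = tp / (tp + fn)
    return 2 * car * pod / (car + pod)

# Hypothetical confusion matrices.
always_no = dict(tp=0, fp=0, fn=20, tn=980)   # never predicts foehn
skilful = dict(tp=15, fp=5, fn=5, tn=975)     # CAR = POD = 0.75

for name, cm in [("always_no", always_no), ("skilful", skilful)]:
    print(name, accuracy(**cm), f1_score(cm["tp"], cm["fp"], cm["fn"]))
```

The classifier that never predicts foehn still reaches an accuracy of 0.98, only 0.01 below the skilful one, whereas the F1 score separates the two clearly (0 versus 0.75).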

Finally, the scores of XGBoost for south foehn in Altdorf and north foehn in Lugano served as the baseline against which we compared the following steps.

d. Generalization to CESM

Next, we applied XGBoost to CESM data. We moved from employing one XGBoost model for the whole year to utilizing 12 XGBoost models, one for each month. This measure allowed us to balance CAR and POD for each month (i.e., FP and FN predictions occur with the same frequency). In addition, the monthly models allowed for inspection of potentially varying feature importances over the year. Formally, we now write the superscript m = 1, …, 12 to clarify that we refer to monthly data (e.g., Dtrain(m)). Despite this additional step, the models underpredicted foehn occurrence on CESM-p, likely because they overfitted to ERAI-specific details and hence generalized poorly to CESM data. For this reason, we employed a range of additional preprocessing measures.

  1. Disregarding local features. Since we aimed to classify the synoptic-scale weather situation, we decided to exclude features in the Alpine region that are likely to be represented differently due to the deviating topography between ERAI and CESM. In particular, we excluded the 21 grid points inside the rectangle between 44.76°N, 3.75°E and 48.53°N, 13.75°E from DERA, DCESMp, and DCESMf (see Figs. 1 and 2).

  2. Quantile rescaling. To compensate for small biases and deviations in the representation of features between ERAI and CESM-p (induced by slightly differing strengths of the pressure gradients), we separately rescaled each feature in both datasets to its quantile representation. To avoid data leakage, the samples in Dtest were rescaled with the same function as those in Dtrain. For CESM-f, we rescaled each feature with the function fitted to the corresponding feature in CESM-p, since otherwise we would likely have eliminated the differences between CESM-p and CESM-f that we wanted to investigate.

  3. Constrained optimization of the loss function. Finally, we adjusted the loss function l for the XGBoost models. More precisely, we introduced an additional loss term that is optimized on the monthly CESM-p samples DCESMp(m) in parallel to the optimization of the log loss [Eq. (2)] on Dtrain(m). We call this term the squared mean error (SME) hereafter and formulate it as
    $$l_{\mathrm{SME}}^{(m)} = \lambda \left[ \frac{\sum_{D_{\mathrm{CESMp}}^{(m)}} p_i}{N_{\mathrm{CESMp}}^{(m)}} - \frac{\sum_{D_{\mathrm{train}}^{(m)}} y_i}{N_{\mathrm{train}}^{(m)}} \right]^2 = \lambda \left[ \mu_{\mathrm{CESMp}}^{(m)} - \mu_{\mathrm{train}}^{(m)} \right]^2,$$

    where NCESMp(m) = |DCESMp(m)| and Ntrain(m) = |Dtrain(m)|. Further, μ denotes the mean foehn frequency, predicted on DCESMp(m) and observed on Dtrain(m). The SME acted as a constraint on the optimization procedure (similarly to regularization) that keeps the predicted mean on DCESMp(m) close to the observed mean on Dtrain(m). With the parameter λ, one can control the strength of this constraint. Thus, the XGBoost models were forced into a trade-off. On the one hand, they were pushed to select features that allowed for an accurate prediction on Dtrain(m) through Eq. (2). On the other hand, the models were penalized through Eq. (5) for selecting features that made the predicted mean on DCESMp(m) deviate too strongly from the observed mean on Dtrain(m). More details regarding the interpretation of the SME can be found in the appendix.
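The first two preprocessing measures can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: the quantile representation is taken to be the empirical cumulative distribution function of a reference sample, the rectangle boundaries are treated as inclusive, and the names `outside_alpine_box` and `QuantileRescaler` as well as all synthetic data are introduced here.

```python
import numpy as np

def outside_alpine_box(lat, lon, lat_min=44.76, lat_max=48.53,
                       lon_min=3.75, lon_max=13.75):
    """True for grid points outside the excluded Alpine rectangle."""
    lat, lon = np.asarray(lat), np.asarray(lon)
    inside = (lat >= lat_min) & (lat <= lat_max) & (lon >= lon_min) & (lon <= lon_max)
    return ~inside

class QuantileRescaler:
    """Map each feature column to its empirical quantile (0..1) in a reference sample."""

    def fit(self, X_ref):
        self.sorted_ = np.sort(np.asarray(X_ref, dtype=float), axis=0)
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        n = self.sorted_.shape[0]
        out = np.empty_like(X)
        for j in range(X.shape[1]):
            # Fraction of reference values that are <= each new value.
            out[:, j] = np.searchsorted(self.sorted_[:, j], X[:, j], side="right") / n
        return out

# Synthetic single-feature example with a bias between the two "models".
rng = np.random.default_rng(0)
era_train = rng.normal(0.0, 1.0, size=(1000, 1))
era_test = rng.normal(0.0, 1.0, size=(200, 1))
cesm_p = rng.normal(0.5, 1.2, size=(1000, 1))   # biased relative to ERAI
cesm_f = cesm_p + 0.3                           # hypothetical future shift

era_scaler = QuantileRescaler().fit(era_train)  # fitted on training data only
cesm_scaler = QuantileRescaler().fit(cesm_p)    # fitted on present-day CESM only
q_train = era_scaler.transform(era_train)
q_test = era_scaler.transform(era_test)
q_p = cesm_scaler.transform(cesm_p)
q_f = cesm_scaler.transform(cesm_f)
print(q_train.mean(), q_p.mean(), q_f.mean())
```

After rescaling, the ERAI and CESM-p features share the same distribution, while the CESM-f feature keeps its shift relative to CESM-p because it is mapped with the function fitted to CESM-p.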

During the implementation for south foehn, we first selected the 250 most important features from an XGBoost model trained on all features of Dtrain via threefold cross validation. Then, we used those features to retrain the monthly models. To also utilize information from other months, the first 10 trees were trained on Dtrain without the constraint condition. In contrast, the next 190 trees were trained only on Dtrain(m) and DCESMp(m) with the constraint condition (λ = 60 000). Afterward, we employed the models to make predictions for all 35 ensemble members of CESM-p and CESM-f. To verify that the models captured the synoptic conditions, we visualized the mean SLP conditions during foehn in the form of composite plots for ERAI, CESM-p, and CESM-f. Here, we used the predicted label to determine a foehn situation (i.e., where ŷi = 1). The aggregated feature importances were generated by summing over the 30 highest ranked feature importances from each monthly model.
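The aggregation of feature importances over the monthly models can be sketched as below. The dictionaries, the feature names, and the helper `aggregate_importances` are hypothetical; the toy example uses two "months" and the two highest ranked features per month instead of 12 months and 30 features.

```python
from collections import Counter

def aggregate_importances(monthly_importances, top_k=30):
    """Sum, per feature, the importances of the top_k features of each monthly model."""
    total = Counter()
    for importances in monthly_importances:
        ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
        for name, value in ranked[:top_k]:
            total[name] += value
    return dict(total)

# Hypothetical importances of two monthly models.
monthly = [
    {"dSLP_a": 0.6, "dZ_b": 0.3, "dT_c": 0.1},
    {"dSLP_a": 0.5, "dT_c": 0.4, "dZ_b": 0.1},
]
aggregated = aggregate_importances(monthly, top_k=2)
print(aggregated)
```

A feature only contributes in the months in which it ranks among the top_k, so features that matter in a single season are not diluted by months in which the model ignores them.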

4. Results and discussion

a. Foehn predictability on ERAI

Table 3 shows the resulting confusion matrices and metrics of the XGBoost models for south and north foehn after adjusting the threshold to balance FP and FN predictions. As the CAR/POD shows, slightly more than three out of four observed or predicted positive samples are classified correctly.

With those metrics, the XGBoost models can be compared to previous studies (i.e., Zweifel et al. 2016; Sprenger et al. 2017). The results are summarized in Table 4. Our models show slightly increased performance compared to prior models for south foehn. For north foehn, we observe slightly lower scores; however, this could also be because Zweifel et al. (2016) predicted north foehn in Piotta, where foehn occurs more frequently than in Lugano. Nevertheless, the comparable skill is remarkable since our model is based on coarse ERAI data with a spatial resolution of approximately 100 km. In contrast, the COSMO-7, COSMO-2, and ECMWF models have a resolution of 7, 2.2, and 16 km, respectively. This finding also supports the claim from Drechsel and Mayr (2008) and Plavcan et al. (2014) that foehn is predictable from its synoptic fingerprint in NWP data as long as the topography is represented with sufficient accuracy.

Table 4.

Shown is the comparison of various evaluation metrics from our model and (quoted) previous scientific work. Note that the raw values are only comparable to some extent due to the different purposes and thresholds of the other projects. Sprenger et al. (2017) used an AdaBoost algorithm to nowcast foehn based upon the COSMO-7 model in Altdorf. Zweifel et al. (2016) used logistic regression with L1-regularization to predict foehn on COSMO-2 (C) and ECMWF (E) data in Altdorf and Piotta. However, note that Zweifel et al. (2016) forecasted foehn 15 h in advance and further used observational data from several weather stations in addition to the NWP model data. Due to the more pronounced class imbalance in Lugano compared to Piotta (foehn occurs less often), accuracy and AUROC present an overly optimistic view on our algorithm’s performance while CAR/POD/F1 score adjust for the respective class imbalance at each location.

The selected features and their importance in the final models are shown in Table 5. As one would expect from Zweifel et al. (2016), Sprenger et al. (2017), and Gerstgrasser (2017), the XGBoost models selected pressure and geopotential height differences as the most important features for both south and north foehn. In addition, the hydrostatic gradient induced by the difference in potential temperature discussed in Gerstgrasser (2017) appears to have played an important role for south foehn. Including these synoptic-scale features might enhance the performance of other models in the future.

Table 5.

Shown are the most important features and their importance in the final XGBoost models fitted to Dtrain for Altdorf (top) and Lugano (bottom).

Hence, we conclude the first objective of this work: In general, it is feasible to infer the existence of foehn from synoptic conditions on a grid as coarse as approximately 100 km. The skill is highly comparable to using NWP models with higher resolution, and the selected parameters coincide with the current state of knowledge.

b. Generalization to CESM

After applying the additional preprocessing measures described in section 3d, the metrics shown in Table 6 were obtained on the ERAI test set. Here, these metrics were calculated after aggregating the predictions from all monthly models. For south foehn, we found that little skill was lost compared to the baseline model that was fitted solely on ERAI (see Table 3), while for north foehn, the reduction in skill is slightly larger. Interestingly, the log loss decreased in both cases. This finding might be due to fewer samples being predicted completely off (e.g., yi = 1 but pi ≈ 0) since these are the mistakes the log loss punishes the harshest.
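The asymmetry of the log loss mentioned above can be made concrete with a two-line calculation. The sketch assumes that Eq. (2), which is not reproduced in this section, is the standard binary log loss; the probabilities are illustrative values.

```python
import math

def log_loss_single(y, p):
    """Binary log loss of one sample with label y (0 or 1) and predicted probability p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# An observed foehn case (y = 1) that is missed narrowly versus missed completely.
narrow_miss = log_loss_single(1, 0.4)
complete_miss = log_loss_single(1, 0.01)
print(round(narrow_miss, 3), round(complete_miss, 3))
```

Both samples count as one FN after thresholding, but the completely missed sample contributes roughly five times as much to the log loss, so removing a few such cases can lower the log loss even if CAR/POD decrease slightly.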

Table 6.

Shown are various evaluation metrics on Dtest after the constrained optimization.

To verify that the models actually captured the synoptic conditions related to foehn, we employed composite maps for observed (yi = 1) foehn cases in ERAI as well as predicted (ŷi = 1) foehn cases in ERAI, CESM-p, and CESM-f (Figs. 3 and 4 for south and north foehn, respectively). We investigated various atmospheric fields (see Table 2) but ultimately show the SLP here since those features ranked high in the feature importances (see Table 5) and the literature also argues that a strong pressure gradient is a necessary condition for foehn (Drechsel and Mayr 2008; Richner and Hächler 2013; Plavcan et al. 2014).
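A composite map is simply the average of a field over all time steps that carry a given label. The following sketch uses a tiny synthetic array in place of the SLP field; the array layout (time, latitude, longitude) and the function name `composite` are assumptions made for illustration.

```python
import numpy as np

def composite(field, mask):
    """Average a (time, lat, lon) field over the time steps where mask is True."""
    field = np.asarray(field, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("no samples selected for the composite")
    return field[mask].mean(axis=0)

# Toy field with 4 time steps on a 2 x 3 grid; steps 0 and 2 are "foehn".
slp = np.arange(24).reshape(4, 2, 3)
foehn = np.array([True, False, True, False])
slp_composite = composite(slp, foehn)
print(slp_composite)
```

Passing the observed labels yields the composite for observed foehn, and passing the thresholded predictions yields the composite for predicted foehn, so the same helper covers all panels of such a comparison.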

Fig. 3.

Shown are the mean SLP conditions during south foehn for (a) ERAI test samples with observed foehn, (b) ERAI test samples with predicted foehn, (c) CESM-p samples with predicted foehn, and (d) CESM-f samples with predicted foehn. Furthermore, the most important ΔSLPs are marked. The thicker the blue line, the more important the specific difference. Because two features have a very large importance, we depict the root of the feature importance here for visibility. The green dot marks the location of Altdorf.

Citation: Weather and Forecasting 36, 6; 10.1175/WAF-D-21-0036.1

Fig. 4.

As in Fig. 3, but for north foehn in Lugano. Note that here we did not scale the ΔSLP (blue lines) with the root of the feature importance.

In all cases, we observe the typical foehn pattern. For south foehn, the typical high pressure system over the Adriatic Sea, a low pressure system to the northwest of the Alps, and the “foehn knee” in the isobars are clearly discernible (Richner and Hächler 2013; Sprenger et al. 2016). The slight differences in the CESM composites are potentially linked to the different topography of the Alps (cf. Fig. 2) or the unequal representation of pressure differences related to the passage of synoptic-scale disturbances over Europe. The selected ΔSLP differences of the models are located in the region of the largest gradient. It is notable that the synoptic foehn conditions were reconstructed in the whole area of interest, although the models appeared to mainly look at a single feature. This observation agrees with the findings of Widmer (1966) and Courvoisier and Gutermann (1971), who also achieved high foehn predictability from four and two parameters, respectively. The pressure component of the Widmer index looks at the pressure difference between Venice and Tours (Courvoisier and Gutermann 1971). In contrast to this difference, we found that foehn can be diagnosed from more local features. However, this result has to be interpreted carefully since first, the Widmer index is used for forecasting purposes and second, the high spatial correlation among features might yield a similar classification performance when using features in spatial proximity. Also, for the case of north foehn, we obtain the synoptic situation described in previous literature (Kljun et al. 2001; Cetti et al. 2015). However, in contrast to south foehn, apparently more features had similar importance to the models. This agrees with Zweifel et al. (2016), who also found that more parameters were needed to describe north foehn accurately and could represent an explanation for the decreased prediction scores in Table 6.

We conclude that, in principle, it is possible to transition from ERAI data to CESM data and identify the synoptic fingerprint of foehn in both models. For south foehn, a few well-selected features sufficed to capture the whole synoptic situation over the Alps.

c. Comparison of monthly foehn frequencies

The mean monthly frequency was defined as the number of samples that showed (yi = 1) or predicted (ŷi = 1) foehn divided by the number of all samples within a specified month. Figures 5a and 6a summarize the monthly frequencies of all present-day datasets. Here, one data point represents the foehn frequency within a given month grouped by year and, for CESM-p, additionally by ensemble member. This grouping resulted in 10 data points for observational data and Dtest(m) (one for each year) and 350 data points for CESM-p (one for each year in each ensemble member). We see that our model mostly captures the annual cycle of foehn frequency in CESM-p, despite the large variation in frequency between ensemble members. However, some larger deviations exist between observational data and CESM-p in September for south foehn. In Figs. 5b and 6b, for ease of interpretation, we compare the decadal frequency between CESM-p and CESM-f for each month. To this end, the monthly foehn frequency was averaged over all 10 realizations of a specific month in each ensemble member, yielding 35 data points each for CESM-p and CESM-f. It is notable that, for certain months, the CESM-p and CESM-f scenarios show a considerable spread, i.e., during these months, more ensemble members show a larger or smaller foehn frequency in CESM-f. In the next steps, these differences are evaluated for their statistical significance.
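The grouping into monthly and decadal frequencies can be sketched with array operations. The array layout (member, year, time step), the number of time steps per month, and the synthetic predictions are assumptions for illustration; only the 35 members and 10 years are taken from the text.

```python
import numpy as np

def foehn_frequency(labels):
    """Fraction of samples flagged as foehn (label 1) along the last axis."""
    return np.asarray(labels, dtype=float).mean(axis=-1)

rng = np.random.default_rng(1)
n_members, n_years, n_steps = 35, 10, 124   # n_steps per month is hypothetical
pred = rng.random((n_members, n_years, n_steps)) < 0.06  # synthetic predicted labels

monthly = foehn_frequency(pred)   # one value per member and year
decadal = monthly.mean(axis=1)    # one value per member
print(monthly.size, decadal.shape)
```

The `monthly` array holds the data points of the present-day comparison (one per year and member), while `decadal` holds the values that enter the comparison between present-day and future simulations.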

  1. Observational data versus ERAI predictions. Here, we utilized an ordinary forecast verification approach in the form of the confusion matrix and derived metrics (see section 3c). Note that each month was treated with a different threshold such that POD and CAR coincided (i.e., the numbers of FP and FN predictions are equal). The idea of applying a different threshold for every month to compensate for potential external factors the model fails to capture has also been applied in the Widmer index (for details see Sprenger et al. 2017).

    First, for south foehn, we found that the prediction worked best during spring, autumn, and winter. Here, our skill ranged from 0.71 to 0.85 as measured by the CAR/POD. During summer, our prediction skill was worst, at around 0.65 in CAR/POD (see Fig. S1 in the online supplemental material). These findings agree with Sprenger et al. (2017), who found the best predictability in winter/spring and the worst during summer. Further, Sprenger et al. (2017) explained that during summer, mechanisms like increased solar irradiance lead to a local heat low in the Alpine region. The resulting local pressure field and thermally driven valley circulation counteract the foehn flow (Lotteraner 2009), although synoptic conditions might favor foehn development, leading to more FP predictions (see Fig. S2 in supplemental material for a more detailed investigation of FP and FN composites). However, in our case, each FN prediction should, in principle, be compensated by an FP prediction and hence not affect the frequency. It should also be noted that the models were far from random predictions: for random predictions, we would find CAR = POD ≈ 0.02, which is the true observed frequency during summer. Nevertheless, since in June and July we had only 15 positive samples in Dtest(m), we decided not to trust the model during these months.

    For north foehn, in most seasons, CAR/POD ranged around 0.67–0.79. Only during August, September, and October did CAR/POD drop to 0.61–0.66 (see Figs. S3 and S4 in supplemental material); these were also the months with the fewest positive samples. However, again, each FN should be compensated by an FP and hence not affect the frequency. Furthermore, the models still showed elevated skill (>0.6) with a reasonable number of positive foehn samples (>40). Thus, we did not reject any model for its suitability to diagnose north foehn.

  2. ERAI predictions versus CESM-p predictions. In this case, a challenge arose because CESM-p contained 350 years of data, whereas ERAI contained only 10 years. However, we argue that ERAI can be viewed as one potential ensemble member of CESM-p, i.e., one potential realization of the weather. Thus, to compare the ERAI and CESM-p predictions, we randomly sampled a month from all the ensemble members for each of the 10 years in CESM-p and calculated the mean foehn frequency. We then repeated this procedure 1000 times and compared the resulting Gaussian distribution with the mean foehn frequency of ERAI.

    Of course, due to the constrained optimization (see section 3d), the ERAI frequency should, by design, be close to the CESM-p frequency. However, the chance existed that the XGBoost models failed to find meaningful features that would generalize well. Hence, if the ERAI mean lay more than two standard deviations from the mean of the distribution, we rejected the hypothesis that they follow the same underlying distribution. In these cases, the model apparently failed to generalize well. For south foehn, this only happened in September (see Fig. S5 in the supplemental material). For north foehn, the hypothesis that ERAI represents a possible realization of present-day climate as defined by CESM-p could be accepted for all months (see Fig. S6 in the supplemental material).

  3. CESM-p predictions versus CESM-f predictions. Last, we used a Wilcoxon rank-sum test and calculated a p value for each month to judge whether the distributions of CESM-p and CESM-f decadal foehn frequencies differ significantly. As the global error rate, we used α = 0.05. Since we dealt with a multiple testing scenario with 12 monthly tests, we further applied the Bonferroni correction, reducing the effective rejection level for each month to α̃ = 0.05/12 ≈ 0.0042. If the p value fell below α̃, we rejected the null hypothesis that the two samples follow the same underlying distribution and concluded statistical significance. As we conclude from Fig. 5b, statistically significant differences are found in February, March, May, July, September, and October (although for April the p value is just above the Bonferroni-corrected threshold). For north foehn (see Fig. 6b), we found significant differences for January, May, July, August, and September.
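The second and third verification steps can be sketched as follows. This is one possible reading of the procedure, not the original code: for every draw, one random ensemble member is picked per year; the rank-sum p value uses the normal approximation without tie correction; and all frequencies are synthetic. The function names and the value `era_mean` are hypothetical.

```python
import math
import numpy as np

def resampled_decadal_means(freq, n_draws=1000, seed=0):
    """freq has shape (members, years). For every draw, pick one random member
    per year and average the picked monthly frequencies over the years."""
    rng = np.random.default_rng(seed)
    n_members, n_years = freq.shape
    picks = rng.integers(0, n_members, size=(n_draws, n_years))
    return freq[picks, np.arange(n_years)].mean(axis=1)

def within_two_sigma(value, draws):
    """True if value lies within two standard deviations of the mean of draws."""
    return bool(abs(value - draws.mean()) <= 2.0 * draws.std(ddof=1))

def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum p value (normal approximation, assumes no ties)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n1, n2 = len(x), len(y)
    ranks = np.argsort(np.argsort(np.concatenate([x, y]))) + 1
    w = ranks[:n1].sum()
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    return math.erfc(abs(z) / math.sqrt(2.0))

# Synthetic monthly frequencies for 35 members and 10 years.
rng = np.random.default_rng(42)
cesm_p = rng.normal(0.060, 0.010, size=(35, 10))
cesm_f = rng.normal(0.075, 0.010, size=(35, 10))

draws = resampled_decadal_means(cesm_p)
era_mean = 0.061                        # hypothetical ERAI mean frequency
print("ERAI consistent with CESM-p:", within_two_sigma(era_mean, draws))

alpha_tilde = 0.05 / 12                 # Bonferroni correction for 12 months
p_value = rank_sum_pvalue(cesm_p.mean(axis=1), cesm_f.mean(axis=1))
print("alpha_tilde:", round(alpha_tilde, 4), "significant:", p_value < alpha_tilde)
```

The rank-sum test compares the 35 decadal means of each experiment and makes no assumption about the shape of their distribution, which is useful because foehn frequencies are bounded at zero.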

Fig. 5.

(a) Shown are the present-day observed or predicted monthly south foehn frequencies in Altdorf for the observed (10 samples), ERAI (10 samples), and CESM-p (350 samples) dataset. (b) Shown are the decadal means of south foehn frequency in Altdorf for each CESM-p and CESM-f ensemble member, i.e., we compare 35 potential decades of present-day climate with 35 decades of future climate here. Also shown are the p values from the Wilcoxon rank-sum test and their significance under the global α = 0.05 with Bonferroni correction. All significant results are marked with an asterisk.

Fig. 6.

As in Fig. 5, but for north foehn in Lugano.

As already argued above, we can only make a statistical statement for all ensemble members combined, i.e., whether we observe a significantly higher (or lower) foehn frequency between the CESM-p and CESM-f distributions over all ensemble members. For example, for Altdorf and February, we observe a decadal foehn frequency > 7.5% (i.e., >50 h) in half of the CESM-f members, while the same fraction in CESM-p only shows a decadal foehn frequency > 6.0% (i.e., >40 h). Summarizing all the above points leaves us with the following conclusion for the third objective: For south foehn, a valid significant increase in the foehn frequency between the CESM-p and CESM-f ensemble members can be inferred for February and March, while April only narrowly misses the Bonferroni-corrected threshold. Thus, we expect a shift of the spring foehn maximum to earlier months. Furthermore, for May and October, we found a significant decrease in foehn frequency. For north foehn, a valid significant increase in the foehn frequency was observed for the two late-spring/summer months, May and July. Additionally, for January, August, and September, we found a significant decrease in foehn frequency. These findings can be compared to present trends in the foehn frequency in the literature. In contrast to the south foehn frequency increase in spring, Gutermann et al. (2012) found that March was a foehn-prone month until 1960 but lost its dominant role afterward. Nonetheless, over the prolonged time span of the next 70 years, this pattern might reverse again; the potential causes have to be identified in further research. For north foehn, Cetti et al. (2015) found an increase in the foehn frequency for several north foehn stations for May, aligning well with our findings. However, again, the long time horizon of our analysis might be difficult to compare with present trends.

5. Conclusions

In this study, we established a projection for the monthly frequencies of south and north foehn under a warming future climate. To this end, we assessed with which skill foehn can be diagnosed from coarse ERAI reanalysis data using state-of-the-art ML models, then generalized the algorithms to freely running CESM climate simulations, and tested the obtained frequencies for significance. We conclude by summarizing our main findings related to the objectives stated in the Introduction (section 1).

  1. First, we scrutinized with which reliability foehn can be inferred from the synoptic situation in coarse NWP reanalysis data. Averaged over a 10-yr period, we managed to achieve a CAR/POD of 0.786 for south foehn and 0.779 for north foehn. When further resolving this on a monthly basis, the highest skill appears to be obtained during spring, autumn, and winter with CAR/POD ranging from 0.71 to 0.85 for south foehn. During the summer months, this skill degraded to approximately 0.65, where a possible explanation can be found in local-scale mechanisms that counteract the archetypal synoptic fingerprint, leading to more FP predictions. This corresponds well to the work of other researchers who obtained similar scores on higher-resolution NWP data. Furthermore, by inspecting composite plots, we found that the synoptic situation identified by the ML models corresponds well with the expectation based on prior physical knowledge. Hence, this suggests that the information contained in coarse reanalysis data suffices to predict foehn occurrence.

  2. Second, we saw that, with sufficient feature preprocessing to compensate for biases between climate models, the transition from a reanalysis to a freely running climate simulation is feasible at the cost of a few misclassified samples. Nevertheless, the identified synoptic fingerprint agrees well between reanalysis and freely running climate simulations. It is notable that, for south foehn, the synoptic situation could be determined from a few well-selected parameters.

  3. Third, we investigated the transition from a present-day climate to a warming future climate. Despite a considerable ensemble spread, we found a significant increase in the south foehn frequency between CESM-p and CESM-f during February and March, and thus expect foehn to become more common during these months. For May and October, a significant decrease was found, and consequently, foehn is anticipated to become less frequent. For north foehn, we obtained a significant increase between CESM-p and CESM-f for May and July. Moreover, we observed a significant decrease in January, August, and September. For this reason, we expect north foehn to become more common in May and July and less common in January, August, and September.

We arrived at our findings by making some limiting assumptions, which are discussed here. First, the results were obtained by applying several preprocessing steps in order to make a viable transition from ERAI to CESM. In the future, applying our methodology to a reanalysis and a climate simulation that have a more similar (potentially even the same) topography and grid would decrease the number of assumptions necessary to obtain a prediction. Furthermore, climate simulations with a present-day climate nudged to reanalysis data would remove the need to build the ML model on the reanalysis and to perform the subsequent generalization step. Second, our model separates between synoptic causes and data. Our methodology will not capture foehn if it is caused by an entirely different synoptic situation in the future. Hence, it is important to stress that our methodology is based on the best present knowledge of what a synoptic foehn situation looks like and generalizes this from ERAI to CESM.

Finally, the methodology we have described in this work is, in principle, also applicable to other regions in the world (e.g., the Rocky Mountains) since ERAI and CESM are global models. One simply has to specify the area of interest and provide foehn diagnosis data based on station measurements. The described procedure will automatically and objectively select the most important features and apply this knowledge to make a projection for future climate. In the process, one can learn from the model which synoptic features matter most for foehn in the specified area. Furthermore, our results raise the question of the underlying physical driver behind the changes in foehn frequency. Potentially, shifts in future foehn frequencies are related to changes in the occurrence and/or location of synoptic-scale disturbances over Europe. However, such an analysis was beyond the scope of this study and is planned for further research.

Acknowledgments

First, we want to express our gratitude to Matteo Buzzi from the Swiss national weather service (MeteoSwiss; www.meteoswiss.ch) for providing us with the required observational foehn data. Furthermore, we are thankful to MeteoSwiss for granting access to ERAI data and Dr. Urs Beyerle for access to the CESM simulations. The data access was also supported by the H2020 European Research Council [INTEXseas (Grant 787652)]. Finally, we are also sincerely thankful to Dr. Lukas Gudmundsson, Dr. Matthias Röthlisberger, and Prof. Heini Wernli for acting as sparring partners challenging our ideas and providing valuable input about statistical learning theory and climate models. We also thank the reviewers for their time to provide us valuable and constructive feedback, resulting in an improved version of this manuscript. Lukas Jansing was funded by the Swiss National Science Foundation (Project: Foehn Dynamics—Lagrangian Analysis and Large-Eddy Simulation; Grant 181992).

Data availability statement

The data used in this study are available upon request from Dr. Michael Sprenger. The code used for the analysis is freely available in this GitHub repository.2

APPENDIX

Gradient Boosted Tree Constraint Optimization

We construct the boosted tree ensemble by
$$f_B^{\mathrm{BT}}(x) = \sum_{b=1}^{B} f_b^{\mathrm{RT}}(x; \theta_b).$$
Here, the $f_b^{\mathrm{RT}}(x; \theta_b)$ are regression trees due to the sequential architecture of the algorithm, even for classification tasks. A popular choice for converting $f_B^{\mathrm{BT}}(x)$ to a probability is the logistic or sigmoid function $p = p[f_B^{\mathrm{BT}}(x)] = 1/\{1 + \exp[-f_B^{\mathrm{BT}}(x)]\}$. Hence, the aim of the gradient boosted tree algorithm is maximizing $f_B^{\mathrm{BT}}(x_i)$ for $y_i = 1$ while minimizing it for $y_i = 0$ and mapping the result to the corresponding probability afterward.
As indicated in section 3a, this can be achieved by minimizing the loss function $l$. For $b - 1$ trees in the ensemble [short $z_i = f_{b-1}^{\mathrm{BT}}(x_i)$ from here], we sum $l(y_i, p_i)$ over all samples in Dtrain and then add the $b$th tree in such a way that it further minimizes the loss. Formally, we write the determination of $\theta_b^*$ as $\theta_b^* = \arg\min_{\theta_b} \sum_{i=1}^{N} l\{y_i, p[z_i + f_b^{\mathrm{RT}}(x_i; \theta_b)]\}$. Usually, $l$ is minimized by gradient descent, hence the name gradient boosting (for more details at this point, see Friedman 2001). For this reason, we require the gradient of the loss function $l$ with respect to the current predictions $z_i$. For example, for the log loss [Eq. (2)], the negative gradient can be written as residuals:
$$-\left\{\frac{\partial l_{\mathrm{LL}}[y_i, p(z_i)]}{\partial z_i}\right\} = y_i - p_i.$$
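For completeness, the residual form of the negative log-loss gradient can be derived in two lines. The derivation assumes that Eq. (2), which is not reproduced in this appendix, is the standard binary log loss and that $p_i$ is the sigmoid of $z_i$ as defined above.

$$
\begin{aligned}
l_{\mathrm{LL}}(y_i, p_i) &= -\left[y_i \log p_i + (1 - y_i) \log(1 - p_i)\right], \qquad \frac{\partial p_i}{\partial z_i} = p_i (1 - p_i), \\
\frac{\partial l_{\mathrm{LL}}}{\partial z_i} &= -\left[\frac{y_i}{p_i} - \frac{1 - y_i}{1 - p_i}\right] p_i (1 - p_i) = -\left[y_i (1 - p_i) - (1 - y_i)\, p_i\right] = p_i - y_i .
\end{aligned}
$$

Negating the last expression gives the residual $y_i - p_i$: each new tree is fitted to the difference between the observed label and the currently predicted probability.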
In parallel, for the SME [Eq. (5)], we also required the gradient and trace of the Hessian with respect to $z_i$ (since XGBoost also considers second-order corrections). After some calculation, the gradient was determined as
$$\frac{\partial l_{\mathrm{SME}}^{(m)}}{\partial z_i} = \frac{2\lambda}{N_{\mathrm{CESMp}}^{(m)}} \underbrace{p_i (1 - p_i)}_{(*)} \underbrace{\left(\mu_{\mathrm{CESMp}}^{(m)} - \mu_{\mathrm{train}}^{(m)}\right)}_{(**)}.$$
The trace of the Hessian can be written as
$$\frac{\partial^2 l_{\mathrm{SME}}^{(m)}}{(\partial z_i)^2} = \frac{2\lambda}{N_{\mathrm{CESMp}}^{(m)}} p_i (1 - p_i) \left[ (1 - p_i) \left(\mu_{\mathrm{CESMp}}^{(m)} - \mu_{\mathrm{train}}^{(m)}\right) + p_i \left(\mu_{\mathrm{train}}^{(m)} - \mu_{\mathrm{CESMp}}^{(m)}\right) + \frac{1}{N_{\mathrm{CESMp}}^{(m)}} p_i (1 - p_i) \right].$$
The different factors in the gradient and Hessian allow for deeper insight into what the SME actually enforces. On the one hand, we find that $(*) \to 0$ for $p_i \to \{0, 1\}$. Consequently, the gradient vanishes for samples where the algorithm is certain about its prediction. On the other hand, we find that $(**) \to 0$ for $\mu_{\mathrm{CESMp}}^{(m)} \to \mu_{\mathrm{train}}^{(m)}$. Thus, the gradient also disappears if the predicted mean on $D_{\mathrm{CESMp}}^{(m)}$ approaches the observed mean on $D_{\mathrm{train}}^{(m)}$. Furthermore, $(**)$ flips the sign of the gradient depending on whether we over- or underpredict foehn on $D_{\mathrm{CESMp}}^{(m)}$. A similar interpretation can be found for the second-order correction terms in the Hessian. Here, the first term makes the model correct its prediction for samples with $p_i$ slightly smaller than 0.5. The second term affects samples with $p_i$ slightly larger than 0.5. The directions of the second-order corrections are again determined by the sign of $\mu_{\mathrm{CESMp}}^{(m)} - \mu_{\mathrm{train}}^{(m)}$. The third term is negligible in our case due to the small magnitude of $1/N_{\mathrm{CESMp}}^{(m)} \approx 10^{-4}$.
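The gradient and Hessian of the SME can be checked numerically. The sketch below implements the two expressions with NumPy and compares them with central finite differences on synthetic raw scores; the function names, the sample size, and the value of `mu_train` are assumptions for illustration, while λ = 60 000 is the value quoted in section 3d. It does not show how the authors combined the SME with the log loss inside XGBoost.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sme_loss(z, mu_train, lam):
    """Squared mean error between the mean predicted probability and mu_train."""
    return lam * (sigmoid(z).mean() - mu_train) ** 2

def sme_grad_hess(z, mu_train, lam):
    """Per-sample gradient and diagonal Hessian of the SME with respect to z."""
    p = sigmoid(z)
    n = p.size
    diff = p.mean() - mu_train          # factor (**)
    w = p * (1.0 - p)                   # factor (*)
    grad = 2.0 * lam / n * w * diff
    hess = 2.0 * lam / n * w * ((1.0 - p) * diff + p * (-diff) + w / n)
    return grad, hess

def finite_difference_check(z, mu_train, lam, i=0, eps=1e-5):
    """Central differences of the loss and of the analytic gradient for sample i."""
    e = np.zeros_like(z)
    e[i] = eps
    g_num = (sme_loss(z + e, mu_train, lam) - sme_loss(z - e, mu_train, lam)) / (2 * eps)
    g_plus = sme_grad_hess(z + e, mu_train, lam)[0][i]
    g_minus = sme_grad_hess(z - e, mu_train, lam)[0][i]
    return g_num, (g_plus - g_minus) / (2 * eps)

rng = np.random.default_rng(0)
z = rng.normal(-1.0, 1.5, size=50)      # synthetic raw scores on "CESM-p"
mu_train, lam = 0.05, 60000.0           # mu_train is a hypothetical observed frequency

grad, hess = sme_grad_hess(z, mu_train, lam)
g_num, h_num = finite_difference_check(z, mu_train, lam, i=3)
print(grad[3], g_num)
print(hess[3], h_num)

# The gradient vanishes for samples with a saturated prediction (p close to 0 or 1).
g_sat, _ = sme_grad_hess(np.array([-40.0, 0.0, 40.0]), mu_train, lam)
print(g_sat)
```

A function that returns such a (gradient, Hessian) pair has the shape that gradient boosting libraries typically expect from a user-defined objective, so a check of this kind is a cheap safeguard before any training run.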

REFERENCES

  • Beusch, L., S. Raveh-Rubin, M. Sprenger, and L. Papritz, 2018: Dynamics of a Puelche foehn event in the Andes. Meteor. Z., 27, 67–80, https://doi.org/10.1127/metz/2017/0841.

  • Breiman, L., J. Friedman, C. J. Stone, and R. A. Olshen, 1984: Classification and Regression Trees. CRC Press, 368 pp.

  • Burri, K., B. Dürr, T. Gutermann, A. Neururer, R. Werner, and E. Zala, 2007: Foehn verification with the COSMO model. Int. Conf. on Alpine Meteorology (ICAM), Chambéry, France, Météo-France, Centre National de la Recherche Scientifique, Laboratoire d’Aérologie, Chambéry City Council, European Meteorological Society, and World Meteorological Organization, http://www.agf.ch/doc/AGF_ICAM-2007_e.pdf.

  • Cetti, C., M. Buzzi, and M. Sprenger, 2015: Climatology of Alpine north foehn. Scientific Rep. 100, MeteoSwiss, 76 pp.

  • Chen, T., and C. Guestrin, 2016: XGBoost: A scalable tree boosting system. Proc. 22nd Int. Conf. on Knowledge Discovery and Data Mining, New York, NY, ACM, 785–794.

  • Courvoisier, H. W., and T. Gutermann, 1971: Zur praktischen Anwendung des Föhntests von Widmer. Vol. 21. Arbeitsberichte der MeteoSchweiz, 7 pp.

  • Davis, J., and M. Goadrich, 2006: The relationship between Precision-Recall and ROC curves. Proc. 23rd Int. Conf. on Machine Learning, Pittsburgh, PA, ACM, 233–240.

  • Dee, D. P., and Coauthors, 2011: The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quart. J. Roy. Meteor. Soc., 137, 553–597, https://doi.org/10.1002/qj.828.

  • Drechsel, S., and G. J. Mayr, 2008: Objective forecasting of foehn winds for a subgrid-scale alpine valley. Wea. Forecasting, 23, 205–218, https://doi.org/10.1175/2007WAF2006021.1.

  • Duerr, B., 2008: Automatisiertes Verfahren zur Bestimmung von Föhn in Alpentälern. Vol. 223. Arbeitsberichte der MeteoSchweiz, 22 pp.

  • Friedman, J. H., 2001: Greedy function approximation: A gradient boosting machine. Ann. Stat., 29, 1189–1232, https://doi.org/10.1214/aos/1013203451.

  • Gerstgrasser, D., 2017: Dokumentation Südföhn. Tech. Rep., MeteoSwiss, 59 pp. [Available from the authors upon request.]

  • Glahn, H. R., and D. A. Lowry, 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor., 11, 1203–1211, https://doi.org/10.1175/1520-0450(1972)011<1203:TUOMOS>2.0.CO;2.

  • Gutermann, T., 1970: Vergleichende Untersuchungen zur Föhnhäufigkeit im Rheintal zwischen Chur und Bodensee. City-Druck AG, 68 pp.

  • Gutermann, T., B. Dürr, H. Richner, and S. Bader, 2012: Föhnklimatologie Altdorf: Die lange Reihe (1864-2008) und ihre Weiterführung, Vergleich mit anderen Stationen. Vol. 241. Fachbericht MeteoSchweiz, 53 pp.

  • Guzman-Morales, J., and A. Gershunov, 2019: Climate change suppresses Santa Ana winds of Southern California and sharpens their seasonality. Geophys. Res. Lett., 46, 2772–2780, https://doi.org/10.1029/2018GL080261.

  • Hächler, P., K. Burri, B. Dürr, T. Gutermann, A. Neururer, H. Richner, and R. Werner, 2011: Der Föhnfall vom 8. Dezember 2006 – eine Fallstudie. Vol. 234. Arbeitsberichte der MeteoSchweiz, 56 pp.

  • Hastie, T., R. Tibshirani, and J. Friedman, 2009: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Science & Business Media, 745 pp.

  • Hoens, T. R., and N. V. Chawla, 2013: Imbalanced datasets: From sampling to classifiers. Imbalanced Learning: Foundations, Algorithms, and Applications, H. He and Y. Ma, Eds., John Wiley & Sons, 43–59.

  • Hogan, R. J., and I. B. Mason, 2012: Deterministic forecasts of binary events. Forecast Verification: A Practitioner’s Guide in Atmospheric Science, I. Jolliffe and D. Stephenson, Eds., John Wiley & Sons, 31–59.

  • Huguenin, M. F., E. M. Fischer, S. Kotlarski, S. C. Scherrer, C. Schwierz, and R. Knutti, 2020: Lack of change in the projected frequency and persistence of atmospheric circulation types over central Europe. Geophys. Res. Lett., 47, e2019GL086132, https://doi.org/10.1029/2019GL086132.

  • Hurrell, J., and Coauthors, 2013: The Community Earth System Model: A framework for collaborative research. Bull. Amer. Meteor. Soc., 94, 1339–1360, https://doi.org/10.1175/BAMS-D-12-00121.1.

  • Kay, J. E., and Coauthors, 2015: The Community Earth System Model (CESM) large ensemble project: A community resource for studying climate change in the presence of internal climate variability. Bull. Amer. Meteor. Soc., 96, 1333–1349, https://doi.org/10.1175/BAMS-D-13-00255.1.

  • Kljun, N., M. Sprenger, and C. Schär, 2001: Frontal modification and lee cyclogenesis in the Alps: A case study using the ALPEX reanalysis data set. Meteor. Atmos. Phys., 78, 89–105, https://doi.org/10.1007/s007030170008.

  • Laffin, M., C. Zender, S. Singh, and M. van Wessem, 2019: 40 years of föhn winds on the Antarctic peninsula: Impact on surface melt from 1979-2018. 2019 Fall Meeting, San Francisco, CA, Amer. Geophys. Union, Abstract 10501254.1, https://doi.org/10.1002/essoar.10501254.1.

  • Lotteraner, C. J., 2009: Synoptisch-klimatologische Auswertung von Windfeldern im Alpenraum. Ph.D. thesis, University Vienna, 112 pp.

  • Louppe, G., 2014: Understanding random forests: From theory to practice. Ph.D. thesis, Université de Liège, 223 pp.

  • Mayr, G., and Coauthors, 2018: The community foehn classification experiment. Bull. Amer. Meteor. Soc., 99, 2229–2235, https://doi.org/10.1175/BAMS-D-17-0200.1.

  • McGowan, H., and A. Sturman, 1996: Regional and local scale characteristics of foehn wind events over the South Island of New Zealand. Meteor. Atmos. Phys., 58, 151–164, https://doi.org/10.1007/BF01027562.

  • Mehta, P., M. Bukov, C.-H. Wang, A. G. Day, C. Richardson, C. K. Fisher, and D. J. Schwab, 2019: A high-bias, low-variance introduction to machine learning for physicists. Phys. Rep., 810, 1–124, https://doi.org/10.1016/j.physrep.2019.03.001.

  • Mercer, A. E., M. B. Richman, H. B. Bluestein, and J. M. Brown, 2008: Statistical modeling of downslope windstorms in Boulder, Colorado. Wea. Forecasting, 23, 1176–1194, https://doi.org/10.1175/2008WAF2007067.1.

  • Miller, N. L., and N. J. Schlegel, 2006: Climate change projected fire weather sensitivity: California Santa Ana wind occurrence. Geophys. Res. Lett., 33, L15711, https://doi.org/10.1029/2006GL025808.

  • Murphy, A. H., 1996: The Finley affair: A signal event in the history of forecast verification. Wea. Forecasting, 11, 3–20, https://doi.org/10.1175/1520-0434(1996)011<0003:TFAASE>2.0.CO;2.

  • Oard, M. J., 1993: A method for predicting Chinook winds east of the Montana Rockies. Wea. Forecasting, 8, 166–180, https://doi.org/10.1175/1520-0434(1993)008<0166:AMFPCW>2.0.CO;2.

  • Pezzatti, G., A. Angelis, and M. Conedera, 2016: Potenzielle Entwicklung der Waldbrandgefahr im Klimawandel. Wald im Klimawandel, A. R. Pluess et al., Eds., Haupt Verlag, 223–245.

  • Plavcan, D., and G. J. Mayr, 2015: Towards an Alpine foehn climatology. 33rd Int. Conf. on Alpine Meteorology, Innsbruck, Austria, Institute of Meteorology and Geophysics, University of Innsbruck, 014.4, https://www.uibk.ac.at/congress/icam2015/abstracts_oral_presentations.htm#O14.4.

  • Plavcan, D., G. J. Mayr, and A. Zeileis, 2014: Automatic and probabilistic foehn diagnosis with a statistical mixture model. J. Appl. Meteor. Climatol., 53, 652–659, https://doi.org/10.1175/JAMC-D-13-0267.1.

  • Raible, C. C., M. Messmer, F. Lehner, T. F. Stocker, and R. Blender, 2018: Extratropical cyclone statistics during the last millennium and the 21st century. Climate Past, 14, 1499–1514, https://doi.org/10.5194/cp-14-1499-2018.

  • Richner, H., and P. Hächler, 2013: Understanding and forecasting Alpine foehn. Mountain Weather Research and Forecasting: Recent Progress and Current Challenges, F. K. Chow, S. F. De Wekker, and B. J. Snyder, Eds., Springer, 219–260.

  • Richner, H., and B. Duerr, 2015: Facts and fallacies related to dimmerfoehn. Tech. Rep., ETH Zurich, 4 pp., https://doi.org/10.3929/ethz-a-010439615.

  • Richner, H., B. Dürr, T. Gutermann, and S. Bader, 2014: The use of automatic station data for continuing the long time series (1864 to 2008) of foehn in Altdorf. Meteor. Z., 23, 159–166, https://doi.org/10.1127/0941-2948/2014/0528.

  • Röthlisberger, M., M. Sprenger, E. Flaounas, U. Beyerle, and H. Wernli, 2020: The substructure of extremely hot summers in the Northern Hemisphere. Wea. Climate Dyn., 1, 45–62, https://doi.org/10.5194/wcd-1-45-2020.

  • Schaller, N., J. Sillmann, J. Anstey, E. M. Fischer, C. M. Grams, and S. Russo, 2018: Influence of blocking on Northern European and Western Russian heatwaves in large climate model ensembles. Environ. Res. Lett., 13, 054015, https://doi.org/10.1088/1748-9326/aaba55.

  • Sharples, J. J., G. A. Mills, R. H. McRae, and R. O. Weber, 2010: Foehn-like winds and elevated fire danger conditions in southeastern Australia. J. Appl. Meteor. Climatol., 49, 1067–1095, https://doi.org/10.1175/2010JAMC2219.1.

  • Simpson, C., H. Pearce, A. Sturman, and P. Zawar-Reza, 2014: Behaviour of fire weather indices in the 2009–10 New Zealand wildland fire season. Int. J. Wildland Fire, 23, 1147–1164, https://doi.org/10.1071/WF12169.

  • Sprenger, M., B. Dürr, and H. Richner, 2016: Foehn studies in Switzerland. From Weather Observations to Atmospheric and Climate Sciences in Switzerland: Celebrating 100 Years of the Swiss Society for Meteorology, S. Willemse and M. Furger, Eds., vdf Hochschulverlag AG, 215–247.

  • Sprenger, M., S. Schemm, R. Oechslin, and J. Jenkner, 2017: Nowcasting foehn wind events using the AdaBoost machine learning algorithm. Wea. Forecasting, 32, 1079–1099, https://doi.org/10.1175/WAF-D-16-0208.1.

  • Steinacker, R., 2006: Alpiner Föhn—Eine neue Strophe zu einem alten Lied. Vol. 32. Promet, 3–10.

  • Van Vuuren, D. P., and Coauthors, 2011: The representative concentration pathways: An overview. Climatic Change, 109, 5–31, https://doi.org/10.1007/s10584-011-0148-z.

  • Vergeiner, J., 2004: South foehn studies and a new foehn classification scheme in the Wipp and Inn valley. Ph.D. thesis, University of Innsbruck, 105 pp.

  • Widmer, R., 1966: Statistische Untersuchungen über den Föhn im Reusstal und Versuch einer objektiven Föhnprognose für die Station Altdorf. Vierteljahrsschr. Natforsch. Ges. Zur., 111, 331–375.

  • Wilhelm, M., M. Buzzi, M. Sprenger, and P. Hächler, 2012: COSMO-2 model performance in forecasting foehn: A systematic process-oriented verification. M.S. thesis, Dept. of Environmental Systems Science, ETH Zürich, 55 pp.

  • WMO, 1992: International Meteorological Vocabulary, Vol. 182. World Meteorological Organization, 784 pp.

  • Zumbrunnen, T., H. Bugmann, M. Conedera, and M. Bürgi, 2009: Linking forest fire regimes and climate—A historical analysis in a dry inner alpine valley. Ecosystems, 12, 73–86, https://doi.org/10.1007/s10021-008-9207-3.

  • Zweifel, L., G. Mayr, and R. Stauffer, 2016: Probabilistic Foehn forecasting for the Gotthard Region based on model output statistics. M.S. thesis, Faculty of Geo- and Atmospheric Sciences, University of Innsbruck, 90 pp.

1 We evaluated several machine learning algorithms for this task: random forests, gradient-boosted tree models (XGBoost, LightGBM, and CatBoost), deep neural networks (DNNs), K-nearest neighbors (KNN) with principal component analysis (PCA) preprocessing, and logistic regression with elastic-net regularization. Because of the spatial nature of the data, we also employed a convolutional neural network (CNN). XGBoost performed best, although the results of some other algorithms were not substantially worse; the CNN approach in particular also performed well. The advantages of XGBoost over CNNs are its less resource-intensive training and its more straightforward interpretability through the intrinsic feature importance.

Supplementary Materials

Save
  • Beusch, L., S. Raveh-Rubin, M. Sprenger, and L. Papritz, 2018: Dynamics of a Puelche foehn event in the Andes. Meteor. Z., 27, 6780, https://doi.org/10.1127/metz/2017/0841.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Breiman, L., J. Friedman, C. J. Stone, and R. A. Olshen, 1984: Classification and Regression Trees. CRC Press, 368 pp.

  • Burri, K., B. Dürr, T. Gutermann, A. Neururer, R. Werner, and E. Zala, 2007: Foehn verification with the COSMO model. Int. Conf. on Alpine Meteorology (ICAM), Chambéry, France, Météo-France, Centre National de la Recherche Scientifique, Laboratoire d’Aérologie, Chambéry City Council, European Meteorological Society, and World Meteorological Organization, http://www.agf.ch/doc/AGF_ICAM-2007_e.pdf.

  • Cetti, C., M. Buzzi, and M. Sprenger, 2015: Climatology of Alpine north foehn. Scientific Rep. 100, MeteoSwiss, 76 pp.

  • Chen, T., and C. Guestrin, 2016: XGBoost: A scalable tree boosting system. Proc. 22nd Int. Conf. on Knowledge Discovery and Data Mining, New York, NY, ACM, 785–794.

    • Crossref
    • Export Citation
  • Courvoisier, H. W., and T. Gutermann, 1971: Zur praktischen Anwendung des Föhntests von Widmer. Vol. 21. Arbeitsberichte der MeteoSchweiz, 7 pp.

  • Davis, J., and M. Goadrich, 2006: The relationship between Precision-Recall and ROC curves. Proc. 23rd Int. Conf. on Machine Learning, Pittsburgh, PA,ACM, 233240.

    • Crossref
    • Export Citation
  • Dee, D. P., and Coauthors, 2011: The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Quart. J. Roy. Meteor. Soc., 137, 553597, https://doi.org/10.1002/qj.828.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Drechsel, S., and G. J. Mayr, 2008: Objective forecasting of foehn winds for a subgrid-scale alpine valley. Wea. Forecasting, 23, 205218, https://doi.org/10.1175/2007WAF2006021.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Duerr, B., 2008: Automatisiertes Verfahren zur Bestimmung von Föhn in Alpentälern. Vol. 223. Arbeitsberichte der MeteoSchweiz, 22 pp.

  • Friedman, J. H., 2001: Greedy function approximation: A gradient boosting machine. Ann. Stat., 25, 11891232, https://doi.org/10.1214/aos/1013203451.

    • Search Google Scholar
    • Export Citation
  • Gerstgrasser, D., 2017: Dokumentation Südföhn. Tech. Rep., MeteoSwiss, 59 pp. [Available from the authors upon request.]

  • Glahn, H. R., and D. A. Lowry, 1972: The use of model output statistics (MOS) in objective weather forecasting. J. Appl. Meteor., 11, 12031211, https://doi.org/10.1175/1520-0450(1972)011<1203:TUOMOS>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Gutermann, T., 1970: Vergleichende Untersuchungen zur Föhnhäufigkeit im Rheintal zwischen Chur und Bodensee. City-Druck AG, 68 pp.

  • Gutermann, T., B. Dürr, H. Richner, and S. Bader, 2012: Föhnklimatologie Altdorf: Die lange Reihe (1864-2008) und ihre Weiterführung, Vergleich mit anderen Stationen. Vol. 241. Fachbericht MeteoSchweiz, 53 pp.

  • Guzman-Morales, J., and A. Gershunov, 2019: Climate change suppresses Santa Ana winds of Southern California and sharpens their seasonality. Geophys. Res. Lett., 46, 27722780, https://doi.org/10.1029/2018GL080261.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Hächler, P., K. Burri, B. Dürr, T. Gutermann, A. Neururer, H. Richner, and R. Werner, 2011: Der Föhnfall vom 8: Dezember 2006–eine Fallstudie. Vol. 234. Arbeitsberichte der MeteoSchweiz, 56 pp.

  • Hastie, T., R. Tibshirani, and J. Friedman, 2009: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Science & Business Media, 745 pp.

  • Hoens, T. R., and N. V. Chawla, 2013: Imbalanced datasets: From sampling to classifiers. Imbalanced Learning: Foundations, Algorithms, and Applications, H. He and Y. Ma, Eds., John Wiley & Sons, 43–59.

    • Crossref
    • Export Citation
  • Hogan, R. J., and I. B. Mason, 2012: Deterministic forecasts of binary events. Forecast Verification: A Practitioner’s Guide in Atmospheric Science, I. Jolliffe and D. Stephenson, Eds., John Wiley & Sons, 31–59.

    • Crossref
    • Export Citation
  • Huguenin, M. F., E. M. Fischer, S. Kotlarski, S. C. Scherrer, C. Schwierz, and R. Knutti, 2020: Lack of change in the projected frequency and persistence of atmospheric circulation types over central Europe. Geophys. Res. Lett., 47, e2019GL086132, https://doi.org/10.1029/2019GL086132.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Hurrell, J., and Coauthors, 2013: The Community Earth System Model: A framework for collaborative research. Bull. Amer. Meteor. Soc., 94, 13391360, https://doi.org/10.1175/BAMS-D-12-00121.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kay, J. E., and Coauthors, 2015: The Community Earth System Model (CESM) large ensemble project: A community resource for studying climate change in the presence of internal climate variability. Bull. Amer. Meteor. Soc., 96, 13331349, https://doi.org/10.1175/BAMS-D-13-00255.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Kljun, N., M. Sprenger, and C. Schär, 2001: Frontal modification and lee cyclogenesis in the Alps: A case study using the ALPEX reanalysis data set. Meteor. Atmos. Phys., 78, 89105, https://doi.org/10.1007/s007030170008.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Laffin, M., C. Zender, S. Singh, and M. van Wessem, 2019: 40 years of föhn winds on the Antarctic peninsula: Impact on surface melt from 1979-2018. 2019 Fall Meeting, San Francisco, CA, Amer. Geophys. Union, Abstract 10501254.1, https://doi.org/10.1002/essoar.10501254.1.

    • Crossref
    • Export Citation
  • Lotteraner, C. J., 2009: Synoptisch-klimatologische Auswertung von Windfeldern im Alpenraum. Ph.D. thesis, University Vienna, 112 pp.

  • Louppe, G., 2014: Understanding random forests: From theory to practice. Ph.D. thesis, Université de Liège, 223 pp.

  • Mayr, G., and Coauthors, 2018: The community foehn classification experiment. Bull. Amer. Meteor. Soc., 99, 22292235, https://doi.org/10.1175/BAMS-D-17-0200.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • McGowan, H., and A. Sturman, 1996: Regional and local scale characteristics of foehn wind events over the South Island of New Zealand. Meteor. Atmos. Phys., 58, 151164, https://doi.org/10.1007/BF01027562.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Mehta, P., M. Bukov, C.-H. Wang, A. G. Day, C. Richardson, C. K. Fisher, and D. J. Schwab, 2019: A high-bias, low-variance introduction to machine learning for physicists. Phys. Rep., 810, 1124, https://doi.org/10.1016/j.physrep.2019.03.001.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Mercer, A. E., M. B. Richman, H. B. Bluestein, and J. M. Brown, 2008: Statistical modeling of downslope windstorms in Boulder, Colorado. Wea. Forecasting, 23, 11761194, https://doi.org/10.1175/2008WAF2007067.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Miller, N. L., and N. J. Schlegel, 2006: Climate change projected fire weather sensitivity: California Santa Ana wind occurrence. Geophys. Res. Lett., 33, L15711, https://doi.org/10.1029/2006GL025808.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Murphy, A. H., 1996: The Finley affair: A signal event in the history of forecast verification. Wea. Forecasting, 11, 320, https://doi.org/10.1175/1520-0434(1996)011<0003:TFAASE>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Oard, M. J., 1993: A method for predicting Chinook winds east of the Montana Rockies. Wea. Forecasting, 8, 166180, https://doi.org/10.1175/1520-0434(1993)008<0166:AMFPCW>2.0.CO;2.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Pezzatti, G., A. Angelis, and M. Conedera, 2016: Potenzielle Entwicklung der Waldbrandgefahr im Klimawandel. Wald im Klimawandel, A. R. Pluess et al., Eds., Haupt Verlag, 223–245.

  • Plavcan, D., and G. J. Mayr, 2015: Towards an Alpine foehn climatology. 33rd Int. Conf. on Alpine Meteorology, Innsbruck, Austria, Institute of Meteorology and Geophysics, University of Innsbruck, 014.4, https://www.uibk.ac.at/congress/icam2015/abstracts_oral_presentations.htm#O14.4.

  • Plavcan, D., G. J. Mayr, and A. Zeileis, 2014: Automatic and probabilistic foehn diagnosis with a statistical mixture model. J. Appl. Meteor. Climatol., 53, 652659, https://doi.org/10.1175/JAMC-D-13-0267.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Raible, C. C., M. Messmer, F. Lehner, T. F. Stocker, and R. Blender, 2018: Extratropical cyclone statistics during the last millennium and the 21st century. Climate Past, 14, 1499–1514, https://doi.org/10.5194/cp-14-1499-2018.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Richner, H., and P. Hächler, 2013: Understanding and forecasting Alpine foehn. Mountain Weather Research and Forecasting: Recent Progress and Current Challenges, F. K. Chow, S. F. De Wekker, and B. J. Snyder, Eds., Springer, 219–260.

    • Crossref
    • Export Citation
  • Richner, H., and B. Duerr, 2015: Facts and fallacies related to dimmerfoehn. Tech. Rep., ETH Zurich, 4 pp., https://doi.org/10.3929/ethz-a-010439615.

    • Crossref
    • Export Citation
  • Richner, H., B. Dürr, T. Gutermann, and S. Bader, 2014: The use of automatic station data for continuing the long time series (1864 to 2008) of foehn in Altdorf. Meteor. Z., 23, 159166, https://doi.org/10.1127/0941-2948/2014/0528.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Röthlisberger, M., M. Sprenger, E. Flaounas, U. Beyerle, and H. Wernli, 2020: The substructure of extremely hot summers in the Northern Hemisphere. Wea. Climate Dyn., 1, 4562, https://doi.org/10.5194/wcd-1-45-2020.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Schaller, N., J. Sillmann, J. Anstey, E. M. Fischer, C. M. Grams, and S. Russo, 2018: Influence of blocking on Northern European and Western Russian heatwaves in large climate model ensembles. Environ. Res. Lett., 13, 054015, https://doi.org/10.1088/1748-9326/aaba55.

    • Crossref
    • Export Citation
  • Sharples, J. J., G. A. Mills, R. H. McRae, and R. O. Weber, 2010: Foehn-like winds and elevated fire danger conditions in southeastern Australia. J. Appl. Meteor. Climatol., 49, 10671095, https://doi.org/10.1175/2010JAMC2219.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Simpson, C., H. Pearce, A. Sturman, and P. Zawar-Reza, 2014: Behaviour of fire weather indices in the 2009–10 New Zealand wildland fire season. Int. J. Wildland Fire, 23, 11471164, https://doi.org/10.1071/WF12169.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Sprenger, M., B. Dürr, and H. Richner, 2016: Foehn studies in Switzerland. From Weather Observations to Atmospheric and Climate Sciences in Switzerland: Celebrating 100 Years of the Swiss Society for Meteorology, S. Willemse and M. Furger, Eds., vdf Hochschulverlag AG, 215–247.

  • Sprenger, M., S. Schemm, R. Oechslin, and J. Jenkner, 2017: Nowcasting foehn wind events using the AdaBoost machine learning algorithm. Wea. Forecasting, 32, 10791099, https://doi.org/10.1175/WAF-D-16-0208.1.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Steinacker, R., 2006: Alpiner Föhn—Eine neue Strophe zu einem alten Lied. Vol. 32. Promet, 3–10.

  • Van Vuuren, D. P., and Coauthors, 2011: The representative concentration pathways: An overview. Climatic Change, 109, 531, https://doi.org/10.1007/s10584-011-0148-z.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Vergeiner, J., 2004: South foehn studies and a new foehn classification scheme in the Wipp and Inn valley. Ph.D. thesis, University of Innsbruck, 105 pp.

  • Widmer, R., 1966: Statistische Untersuchungen über den Föhn im Reusstal und Versuch einer objektiven Föhnprognose für die Station Altdorf. Vierteljahrsschr. Natforsch. Ges. Zur., 111, 331375.

    • Search Google Scholar
    • Export Citation
  • Wilhelm, M., M. Buzzi, M. Sprenger, and P. Hächler, 2012: COSMO-2 model performance in forecasting foehn: A systematic process-oriented verification. M.S. thesis, Dept. of Environmental Systems Science, ETH Zürich, 55 pp.

  • WMO, 1992: International Meteorological Vocabulary, Vol. 182. World Meteorological Organization, 784 pp.

  • Zumbrunnen, T., H. Bugmann, M. Conedera, and M. Bürgi, 2009: Linking forest fire regimes and climate—A historical analysis in a dry inner alpine valley. Ecosystems, 12, 7386, https://doi.org/10.1007/s10021-008-9207-3.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • Zweifel, L., G. Mayr, and R. Stauffer, 2016: Probabilistic Foehn forecasting for the Gotthard Region based on model output statistics. M.S. thesis, Faculty of Geo- and Atmospheric Sciences, University of Innsbruck, 90 pp.

  • Fig. 1.

    Grid points of the CESM grid (blue). While the ERAI grid is defined on full degrees (intersection of parallels and meridians), CESM is slightly coarser and shifted. The locations of Altdorf (ALT) and Lugano (LUG) are marked in red. The olive points indicate the grid points in the Alps, which we removed to achieve better generalization to CESM (see section 3 for details).

  • Fig. 2.

    Topography of (a) ERAI and (b) CESM in terms of averaged surface pressure. The locations of the south foehn station Altdorf (ALT) and the north foehn station Lugano (LUG) are shown in white. The black dotted frame indicates the area within which we removed the features for the CESM prediction (see section 3 for details).

  • Fig. 3.

    Mean SLP conditions during south foehn for (a) ERAI test samples with observed foehn, (b) ERAI test samples with predicted foehn, (c) CESM-p samples with predicted foehn, and (d) CESM-f samples with predicted foehn. The most important ΔSLPs are also marked: the thicker the blue line, the more important the specific difference. Because two features have a very large feature importance, we depict the root of the feature importance here for visibility. The green dot marks the location of Altdorf.

  • Fig. 4.

    As in Fig. 3, but for north foehn in Lugano. Note that here we did not scale the ΔSLP (blue lines) with the root of the feature importance.

  • Fig. 5.

    (a) Present-day observed or predicted monthly south foehn frequencies in Altdorf for the observed (10 samples), ERAI (10 samples), and CESM-p (350 samples) datasets. (b) Decadal means of south foehn frequency in Altdorf for each CESM-p and CESM-f ensemble member; i.e., we compare 35 potential decades of present-day climate with 35 decades of future climate. Also shown are the p values from the Wilcoxon rank-sum test and their significance under the global α = 0.05 with Bonferroni correction. All significant results are marked with an asterisk.

  • Fig. 6.

    As in Fig. 5, but for north foehn in Lugano.
