1. Introduction
To investigate climate change and its consequences, general circulation models (GCMs) are among the essential tools. However, their coarse spatial resolution limits the potential benefits of GCMs in regional applications (Lun et al. 2020). Consequently, spatial downscaling is required to efficiently utilize the information obtained from GCMs. Downscaling methods are generally divided into dynamical and statistical methods. Dynamical downscaling methods use GCM fields, which can include regional conditions (e.g., orographic forcings), as boundary conditions to numerically solve the differential equations of physical processes. The resulting regional climate models (RCMs) are very similar to GCMs in simulating physical climate processes, while covering a limited area of the globe at a much finer resolution. However, limits in the knowledge of climate systems, the sensitivity of process-based models to data quality, and uncertainties in both GCM and observed data are challenges in employing dynamical downscaling methods. Moreover, RCMs inherit the biases of the driving GCM, and the results are still inaccurate for extreme events. On the other hand, in statistical downscaling methods, functions are created to map the predictors (coarse-grid GCM outputs) to predictands (regional observed data). Statistical downscaling methods are commonly used when data and knowledge are insufficient, or when descriptions of the underlying physics of the system are not required (Xu et al. 2019; Maraun and Widmann 2018).
Recently, deep learning (DL) methods, which are capable of extracting complex patterns from multidimensional data, have also been employed for statistical downscaling (Nourani et al. 2018; Li and Babovic 2019; Sha et al. 2020; Sun and Lan 2021). Long short-term memory (LSTM) is a DL-based recurrent neural network designed to capture long-term dependencies in data (Hochreiter and Schmidhuber 1997); with respect to downscaling, Tran Anh et al. (2019) compared the accuracy of LSTM and a feedforward neural network (FFNN) for statistical downscaling of precipitation. Statistical downscaling approaches are mostly based on point predictions. A point prediction maps a single predictor or a set of predictors to a single predictand at each time step (Wang et al. 2020). The main goal of this approach is to minimize the error between observed and computed values; in other words, to enhance the accuracy of the statistical downscaling model.
However, climate change modeling can be affected by various uncertainties that also enter statistical downscaling. Characterizing and quantifying these uncertainties is of prime importance in the investigation of climate change and its impact (Maraun and Widmann 2018; Ehteram et al. 2018; Shamshirband et al. 2019). Uncertainties stemming from climate models and future scenarios can be assessed with multi-GCM and multiscenario ensemble approaches (Nourani et al. 2019; Wang et al. 2019). However, uncertainties arising in statistical downscaling, the climate's internal variability, errors in observational data, and the uncertainty of model parameters are not assessable via point prediction methods.
To address the uncertainty involved in modeling with artificial neural networks (ANNs), prediction intervals (PIs) can be estimated and used. Point prediction generates a single prediction and its error; however, these values provide no information on the uncertainties involved in the modeling or on how likely a future prediction is to be close to an observed value. In contrast, PIs consist of two bounds, a lower and an upper bound, estimated from independent predictors such that a future observation of the predictand falls within the bounds with a specified confidence level. Hence, a set of PIs informs the modeler about the range of future events at that confidence level. PIs can address aleatoric uncertainty, such as noise in the data, as well as epistemic uncertainty, such as modeling simplifications. Consequently, PIs deal with more sources of uncertainty and are always wider than confidence intervals (Khosravi et al. 2011).
According to the literature, several classic PI construction techniques have been proposed and applied in different engineering fields, such as mean–variance estimation (MVE), provided by Nix and Weigend (1994); the delta method (de Veaux et al. 1998); and the bootstrap, proposed by Efron (1979). The Hessian matrix calculation is time consuming in the delta method, and the bootstrap method has a high computational cost for large datasets. Besides being difficult to implement, these methods rest on restrictive assumptions, such as homogeneous and normally distributed noise for the delta and MVE methods (Khosravi et al. 2011; Kabir et al. 2018). In most such classic methods, a prior probability distribution must be assigned to the parameter; however, due to the stochastic nature of hydroclimatic parameters, say precipitation, a parameter may have different distributions in different climatic regions (e.g., see Fig. 1, which shows the probability distribution of precipitation for the two case studies of this research, Tabriz and Rasht, Iran), and it is challenging to find such a generalized global distribution for a climatic parameter.
To cope with the abovementioned drawbacks of classic PI estimation methods, the lower upper bound estimation (LUBE) method has recently been developed and used in different fields of engineering (Khosravi et al. 2011; Kabir et al. 2018) as well as in hydroclimatology (Nourani et al. 2019, 2022). LUBE provides several advantages over other ANN-based uncertainty quantification methods, notably being free of assumptions about parameter and data distributions. Unlike PI methods such as the delta and Bayesian methods, which are computationally intensive, the LUBE method is considered computationally efficient (Khosravi et al. 2011). LUBE-based approaches have been successfully linked to DL-based ANNs to model hydroclimatic processes (e.g., Taormina and Chau 2015), flow prediction (Yan et al. 2021), and groundwater modeling (Nourani et al. 2022). Although distribution-based DL methods have already been employed in statistical downscaling (see Vandal et al. 2018; Baño-Medina et al. 2020), to the best of the authors' knowledge, no study has used a distribution-free and prognostic LSTM-LUBE method, which includes long-term dependencies of time series, in statistical downscaling of hydroclimatic parameters. Therefore, despite the highly stochastic nature of precipitation data, autoregressive features of the process can be explored and included in the modeling via the proposed method.
In this paper, output data of a single GCM [the Canadian Earth System Model, version 5 (CanESM5)] under a single future scenario, namely, shared socioeconomic pathway 5-8.5 (SSP5-8.5), the most pessimistic of the available scenarios (SSP1-1.9, SSP1-2.6, etc.), were employed to quantify the uncertainties stemming from downscaling via the classic FFNN and DL-based LSTM methods. In this regard, the observed data and modeling results of two stations with different climates and different proximities to the global grid points were assessed via the LUBE method. Ultimately, the calibrated models were employed for future projections. Since there are no observational data for the future against which to evaluate the projections, the projected values of the proposed downscaling method were compared with the outputs of an RCM.
This study had the following objectives:
- Evaluation of the newly proposed LSTM-LUBE method for uncertainty quantification in statistical downscaling of different climatic parameters at different climatic regions.
- Assessment of the climate change impact on projections of climatic parameters in both point prediction and prediction interval tasks.
- Comparison of classic and DL-based ANNs for downscaling of climatic parameters at different regions.
- Comparison of future statistical downscaling projections with RCM projections.
- Analysis of the potential climate changes for the near and far future in Tabriz and Rasht, via point prediction and PI projections.
2. Materials and methods
a. Case study and used data
In this study, observational data from the synoptic stations of two cities in different climate regions (i.e., Tabriz and Rasht; see Fig. 2) were used to evaluate the capability of the proposed methods. Tabriz, an industrial city, is located in the northwest of Iran (38.07°N, 46.14°E) between the Einali and Sahand Mountains at 1364 m above sea level. Tabriz is classified as a semiarid region by the De Martonne aridity index, with a mean annual precipitation of 290 mm and a mean temperature of 13°C. In contrast, Rasht is classified as an extremely wet region by the De Martonne index, with a mean annual precipitation of 1500 mm and an average temperature of 16°C. The city is located in the north of Iran (37.16°N, 47.35°E) between the Alborz Mountains and the Caspian Sea. Meteorological data of Tabriz from 1951 to 2014 and of the Rasht station from 1977 to 2014 were used in this study, since the GCM historical data are available up to 2014.
Predictor data employed in this study were taken from CanESM5, developed by the Canadian Centre for Climate Modelling and Analysis (CCCMA), under phase 6 of the Coupled Model Intercomparison Project (CMIP6). The historical simulations of CMIP6 use natural and anthropogenic forcings over the period 1850–2014. CMIP6 consists of advanced Earth system models (ESMs), including historical land use, greenhouse gas emission, concentrations, specifications, and related future scenarios (Tokarska et al. 2020). CMIP6 outputs have been proven in previous studies to be suitable for statistical downscaling over the region of this study (e.g., Nourani et al. 2018; Raju and Kumar 2020; Tabari et al. 2021). Among other GCMs under CMIP6, CanESM5 showed higher accuracy in statistical downscaling studies in the region; therefore, it was selected for this study (Nourani et al. 2019, 2023).
Although there are different interpretations, perfect prognosis is a term in the climatology and statistical downscaling context that refers to a downscaling approach satisfying at least three assumptions: (i) predictors must be bias free with respect to present climate observations; (ii) predictors must be informative and explain local variability; and (iii) the statistical model must establish a reasonable connection between predictors and predictands (Maraun and Widmann 2018). Previous studies have shown that calibrating statistical downscaling models with reanalysis data as inputs, to fulfill the perfect-prognosis assumption, may result in more accurate downscaling outcomes (Hosseini Baghanam et al. 2019). However, a GCM output dataset was used in this study as model input, since such a perfect-prognosis approach still cannot correct model biases, and the projections for the future may become biased too (Maraun and Widmann 2018). There is no separate bias correction step in statistical downscaling via black-box models such as ANNs. Therefore, if reanalysis data are used for model calibration and then the GCM's future scenario data are used for projections, the model will be blind to any bias between the GCM outputs and the reanalysis data. In addition, for long lead times (e.g., year 2100), the uncertainty of projections is mostly related to the uncertainty of future scenarios rather than to internal variability. Hence, an exact temporal correspondence, such as that between reanalysis data and observational data, will not reduce the uncertainty of projections significantly. Consequently, GCM data were used for downscaling purposes in this study [for more details about such uncertainty sources, see Maraun and Widmann (2018)].
The GCM-based monthly historical and SSP5-8.5 future scenario data, with a resolution of 2.8° (approximately 200 km for the study region; see Fig. 2), were extracted from the Lawrence Livermore National Laboratory website (Swart et al. 2019; https://esgf-node.llnl.gov). Because of the similarities in the outputs of different variants (Swart et al. 2019) and the scope of this study, only outputs of one variant label (i.e., r1i1p1f1) were used here. Among the available socioeconomic scenarios (SSP1-1.9, SSP1-2.6, etc.), SSP5-8.5, as the most pessimistic scenario, was selected for this study. The rationale for this selection is to assess the consequences of the worst possible climate change impact, corresponding to fossil fuel driven socioeconomic development with high forcing (Riahi et al. 2017). For each synoptic station, data from the four nearest grid points were considered as potential predictors. Among the vast variety of available model outputs, 120 variables (some at different pressure levels) that were relevant to the region and scope of this study were selected as potential predictors. A list of CanESM5 model outputs is available in the appendix of Swart et al. (2019).
Because of the absence of future observational data, there is no explicit approach for evaluating the accuracy of the future projection results of the proposed statistical downscaling methodology. However, available RCM outputs (i.e., CanRCM4 under the rcp8.5 scenario of CMIP5) were employed to provide complementary and comparative information besides the statistical downscaling projections.
b. Developed and used method
This research aimed to statistically downscale the precipitation and temperature parameters via ANNs and to quantify the uncertainty that evolves during downscaling; thereafter, future projections of precipitation and temperature were estimated. To this end, a three-step procedure was followed (Fig. 3). In the first step, predictor screening took place via a clustering-based analysis and the mutual information (MI) criterion [section 2b(1)] to reduce the number of predictors. In the second step, the statistical downscaling models were calibrated; for this purpose, ANNs designed for point prediction and PIs were trained and validated with observed data [section 2b(2)]. In the final step, the calibrated models were employed to project temperature and precipitation values for the future and, to provide complementary information about the reliability of the projected values for decision-makers, the results were compared with the projection outputs of an RCM [section 2b(3)].
1) Predictor screening
The potential predictors are GCM outputs, in which there are no discontinuities, noise, or temporal shift problems. Moreover, a normalizing preprocessing was applied to the data, which eliminates the scale difference problem among the data. To describe local variables, which are stochastic and highly affected by internal variability, using multiple predictors can enhance accuracy. For instance, precipitation requires the formation of supersaturated moist air; thus, humidity variables can be informative for downscaling precipitation. There is a large number of potential predictors (i.e., 120 variables at each of the four grid points around each synoptic station, or 480 in total) that can be employed in the statistical downscaling process. While informative data at surrounding grid points can promote modeling accuracy, redundant information may degrade the efficiency of data-based models. Therefore, determining the influential predictors is of prime importance. An exhaustive search would require 2^480 − 1 trial-and-error repetitions to find the dominant predictors, which is infeasible. Moreover, exploring and visualizing GCM outputs revealed that some parameters follow similar signal-like patterns, differing only in amplitude, scale, and phase, whereas other GCM outputs, such as the naturally stochastic precipitation output, show completely different time series. Therefore, a dominant predictor selection step based on time series distance clustering was used (Aghabozorgi et al. 2015). In this regard, GCM time series were treated as vectors (the value at the ith time step as the ith entry), and the K-means clustering method (MacQueen 1967) was applied to group similar predictors into the same cluster. The optimum cluster number was obtained via the silhouette measure (Halkidi et al. 2001). Subsequently, the center of each cluster was selected as the representative predictor.
Then, the centers of clusters with high relevancy to predictands were selected as dominant predictors. To do so, the MI measure (Yang et al. 2000) was employed to determine the nonlinear relation between predictors (center of clusters) and predictands. It is worth mentioning that the MI measure, based on entropy (Shannon 1948), is proven to be more informative than the linear correlation for statistical downscaling (Nourani et al. 2019).
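The two-stage screening above (cluster the predictor time series, then rank cluster centers by MI with the predictand) can be sketched as follows. This is a minimal numpy-only illustration, not the authors' implementation: the cluster count is fixed at k = 3 for brevity rather than chosen by the silhouette measure, MI is estimated with a simple 2D histogram, and the predictor series are synthetic.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Simple K-means over predictor time series (rows = predictor vectors)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        # assign each series to its nearest center (Euclidean distance)
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        new = np.array([X[labels == j].mean(0) if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

def mutual_information(x, y, bins=16):
    """Histogram estimate of MI between a predictor and a predictand series."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(1), pxy.sum(0)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])).sum())

# toy example: 12 standardized predictor series sharing a seasonal signal
rng = np.random.default_rng(1)
t = np.arange(240)                      # 20 years of monthly steps
base = np.sin(2 * np.pi * t / 12)
predictors = np.stack([a * base + 0.1 * rng.standard_normal(t.size)
                       for a in rng.uniform(0.5, 2.0, 12)])
predictand = base + 0.2 * rng.standard_normal(t.size)

labels, centers = kmeans(predictors, k=3)
# rank cluster representatives by MI with the predictand
mi = [mutual_information(c, predictand) for c in centers]
```

In the study itself, the highest-MI cluster centers are then retained as the dominant predictors.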
2) Statistical downscaling
For both point prediction and PI models, a split-sampling approach was used. Since the LSTM model aims to extract the temporal characteristics of the data, the dataset was not shuffled and the sequence of the data was preserved. Moreover, since this study aimed to assess climate change impacts in the future, the earliest 80% of the historical dataset was used for calibration, whereas the latest 20% was used for validation of the models. Consequently, the validation subset can be interpreted as the future of the training dataset, which allows the model to be validated outside the training period.
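The chronological split described above can be written as a small helper; this is a sketch, with an illustrative array standing in for the predictor/predictand data.

```python
import numpy as np

def chronological_split(X, y, train_frac=0.8):
    """Split without shuffling, so the validation subset follows the
    calibration subset in time (the 'future' of the training data)."""
    n_train = int(len(y) * train_frac)
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])

# 50 monthly time steps, 2 predictors per step
X = np.arange(100.0).reshape(50, 2)
y = np.arange(50.0)
(X_tr, y_tr), (X_va, y_va) = chronological_split(X, y)
```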
Two ANN structures, FFNN and LSTM, were used in this study. FFNN, a widely used technique in different fields of engineering, maps a flattened input layer to one or a set of targets through nonlinear activation functions. FFNN was selected for this study due to its higher accuracy over other machine learning (ML) tools in statistical downscaling. For modeling by FFNN, the selected dominant predictors at each time step were used as inputs for the predictand (i.e., the observed climatic parameter) at the same time step (e.g., selected monthly GCM output values in December 2000 were used as inputs for the monthly average of observed temperature or precipitation in December 2000; see Fig. 4a). LSTM is a DL method for including long-term dependencies in the modeling and has recently been applied to hydrological modeling (e.g., Xiang and Demir 2020; Sit et al. 2020; Nourani et al. 2022) as well as statistical downscaling of GCMs (Tran Anh et al. 2019). Because of the weak temporal correspondence between predictors (GCM outputs) and predictands (observations), LSTM was employed to include the effect of previous time steps of the predictors on the statistical downscaling. To this end, a sequence-to-sequence LSTM was trained using the predictor time series (i.e., a multichannel sequence) as the input sequence and the predictand time series as the single-channel target sequence (Fig. 4b).
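To make the role of the recurrent states concrete, the LSTM recurrence can be sketched in plain numpy. This is a minimal sequence-to-sequence forward pass with random untrained weights, not the trained network of this study; the dimensions (5 predictors, 8 hidden units, 48 monthly steps) are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. W: (4H, D) input weights, U: (4H, H) recurrent
    weights, b: (4H,) biases, for the four gates stacked together."""
    H = h.size
    z = W @ x + U @ h + b
    i = sigmoid(z[:H])            # input gate
    f = sigmoid(z[H:2*H])         # forget gate: controls long-term memory
    g = np.tanh(z[2*H:3*H])       # candidate cell update
    o = sigmoid(z[3*H:])          # output gate
    c = f * c + i * g             # cell state carries long-term dependencies
    h = o * np.tanh(c)
    return h, c

def run_sequence(xs, D, H, seed=0):
    """Sequence-to-sequence pass: one scalar output per input time step."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((4 * H, D))
    U = 0.1 * rng.standard_normal((4 * H, H))
    b = np.zeros(4 * H)
    w_out = 0.1 * rng.standard_normal(H)
    h, c = np.zeros(H), np.zeros(H)   # states start at zero at the sequence edge
    out = []
    for x in xs:
        h, c = lstm_step(x, h, c, W, U, b)
        out.append(float(w_out @ h))
    return np.array(out)

# 48 monthly steps, 5 predictor channels
xs = np.sin(np.linspace(0, 4 * np.pi, 48))[:, None].repeat(5, axis=1)
ys = run_sequence(xs, D=5, H=8)
```

The zero-initialized `h` and `c` at the start of the sequence are what produce the edge effect discussed next.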
After the training step of downscaling with LSTM, the hidden and cell states were updated during the validation step, while the predictions were made in the sequential order of the data. The hidden and cell states, which carry information from previous time steps, were initialized to zero at the first prediction of a sequence and updated at each subsequent prediction. Thus, the model was expected to yield low accuracy at the first time steps due to this edge effect. Therefore, a warm-up period of 2 yr was considered in the training step, which was included in the modeling process but excluded from the overall results. Furthermore, to include and quantitatively evaluate the underlying uncertainty in the statistical downscaling, and to provide reliable PIs with specified confidence levels, the LUBE method was applied to the employed ANNs. In particular, two confidence levels were used in this study to assess the effect of changing the confidence level on the quality (i.e., width) of the PIs. The details of the LUBE method are discussed in section 2c.
3) Projections for future
After calibration of the models for point predictions and PIs, projections of the precipitation and temperature parameters were produced under the SSP5-8.5 scenario, the most pessimistic socioeconomic scenario, to assess the consequences of the worst possible case. For this purpose, the same dominant predictor variables from SSP5-8.5 were fed into the calibrated models to produce predictions for the near (2046–50) and far (2096–2100) futures. While projecting with the LSTM model, a warm-up period of 2 yr was also applied for each future projection period, to let the recurrent states reach their optimal conditions. In addition, to minimize the edge effect, the hidden states of the LSTM cell were manually initialized with the precipitation value of the nearest grid point (i.e., A1 for the Tabriz and A5 for the Rasht stations) during the 2-yr warm-up period for future projections. To this end, the precipitation flux (pr) at the mentioned grid points was converted to precipitation depth in millimeters (1 kg m−2 s−1 = 86 400 mm day−1) and then normalized with respect to the training precipitation data.
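The flux-to-depth conversion used for the warm-up values can be sketched as below. The min-max normalization shown is an assumption for illustration, since the text does not state the exact normalization scheme.

```python
import numpy as np

def pr_flux_to_mm_per_day(pr_flux):
    """Precipitation flux (kg m-2 s-1) to depth (mm day-1).
    1 kg of water over 1 m^2 is a 1-mm layer, and a day has 86 400 s."""
    return pr_flux * 86_400.0

def normalize_like_training(x, train_data):
    """Scale with the training set's statistics (min-max here, as an assumption)."""
    lo, hi = float(train_data.min()), float(train_data.max())
    return (x - lo) / (hi - lo)

pr_flux = np.array([1.157e-5, 2.315e-5])   # roughly 1 and 2 mm day-1
pr_mm = pr_flux_to_mm_per_day(pr_flux)
scaled = normalize_like_training(pr_mm, train_data=np.array([0.0, 300.0]))
```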
Projected values were compared with the outputs of CanRCM4, a dynamical downscaling product. Although CanRCM4 has a finer resolution, it is based on CanESM, version 2, and the older CMIP5; therefore, its outputs do not benefit from the advances of the latest CMIP6. Moreover, the available grid points of the CanRCM4 outputs are not located in the selected cities, and urban areas can have different climate characteristics from nearby rural areas or woodlands (e.g., the urban heat island; Kim 1992). Hence, CanRCM4 outputs are not reliable enough to validate the future projections provided in this study but may provide complementary information about the projected climatic variables. The closest grid point to each synoptic station was selected from the RCM (Fig. 2).
c. Prediction interval estimation by LUBE method
The LUBE method is a two-stage training method for PIs. First, an ANN with two outputs is designed. Then, the ANN is trained with gradient descent–based algorithms, with both outputs set to intercept a single observation as the target. This stage is not mandatory, but it is an efficient initialization of the network that promotes the performance of the second stage. For calibrating the ANNs designed to estimate PIs, and for evaluating the estimated PIs, the coverage width criterion (CWC) can be utilized.
CWC consists of two inversely related penalty functions: normalized mean prediction interval width (NMPIW) and prediction interval coverage probability (PICP). NMPIW measures the width of the PIs; therefore, a lower NMPIW indicates a more efficient solution. However, narrow PIs may cover fewer observations (targets), which makes them undesirable. PICP measures the ratio of targets covered by the PI bounds to the total number of targets. Consequently, PICP values closer to 1 are desirable, though achieving such PICPs may widen the PIs. Given the drawbacks of using NMPIW or PICP alone as a penalty function, CWC is a compound penalty function in which a low CWC ensures both a high PICP and a low NMPIW for a set of estimated PIs (Khosravi et al. 2011). The detailed calculations of the mentioned penalty functions are discussed in section 2d.
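These criteria can be written down directly. The sketch below uses one common form of the CWC from Khosravi et al. (2011); the nominal coverage `mu` and the sharpness parameter `eta` are illustrative choices, not the study's settings.

```python
import numpy as np

def picp(y, lower, upper):
    """Fraction of observations falling inside their prediction intervals."""
    return float(np.mean((y >= lower) & (y <= upper)))

def nmpiw(y, lower, upper):
    """Mean PI width, normalized by the range of the targets."""
    return float(np.mean(upper - lower) / (y.max() - y.min()))

def cwc(y, lower, upper, mu=0.95, eta=50.0):
    """Coverage width criterion: penalizes wide PIs and, via the exponential
    term, coverage below the nominal level mu."""
    p, w = picp(y, lower, upper), nmpiw(y, lower, upper)
    gamma = 1.0 if p < mu else 0.0   # penalty active only when coverage is short
    return float(w * (1.0 + gamma * np.exp(-eta * (p - mu))))

y = np.array([1.0, 2.0, 3.0, 4.0])
lo_b, up_b = y - 0.5, y + 0.5        # PIs of width 1 covering every target
```

Here the toy PIs cover all targets (PICP = 1), so the CWC reduces to the NMPIW alone.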
Since CWC is not differentiable, ANNs with CWC as the penalty function cannot be trained with gradient descent algorithms. Thus, in the second stage, a metaheuristic optimization scheme [e.g., the genetic algorithm (GA) or simulated annealing (SA)] is applied to the designed ANNs to minimize the CWC. These algorithms can optimize a set of parameters using only the penalty values calculated for each generated parameter set, and they require no derivative calculations. An SA scheme, as an effective and reliable optimization tool, was employed in this study (Kirkpatrick et al. 1983), since GA imposes a greater computational burden when calibrating the large number of weights involved in LSTM networks (Nourani et al. 2019). At this stage, the parameters (weights and biases) extracted from the trained ANNs were used as initialization values for the optimization. The double-output ANN and the CWC can be considered the objective and penalty functions of the optimization algorithm, respectively.
At each iteration, a prediction is first made with the predictor data of the training dataset and the current parameters, which results in two outputs per time step of the predictands. These outputs are the lower and upper bounds, and each pair is called a PI. Next, the CWC is calculated from the predictand data of the training dataset and the predicted PIs. Then, an updated parameter set is created via random perturbation in a uniformly random direction around the previous set of parameters, with a step length equal to the square root of a parameter called the temperature. In the Boltzmann probability factor, the temperature parameter determines the probability of accepting a worse solution; a higher temperature allows exploring an extended solution space to avoid getting stuck in local solutions and to reach a global optimum (Kirkpatrick et al. 1983). A new set of PIs is estimated with the new parameters to calculate the new CWC. By comparing the new and previous CWC values, the algorithm decides which set of parameters to pass to the next iteration: it keeps the newly generated parameters if their CWC is lower than that of the prior iteration; otherwise, it decides which set to keep via the Boltzmann probability factor. The temperature is updated at every iteration and reduced to limit uphill exploration. After reaching a CWC lower than a specified threshold, or after reaching the specified maximum number of iterations, the algorithm stops and returns the parameter set with the optimum CWC. This set of weights and biases is then used to construct an ANN and estimate PIs for the validation dataset. Last, the CWC for the predicted PIs of the validation dataset can be calculated and reported.
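The loop described above can be sketched as follows. This is a generic simulated annealing minimizer with a toy quadratic penalty standing in for the CWC of the double-output network; the cooling schedule and hyperparameters are illustrative, not the study's settings.

```python
import numpy as np

def simulated_annealing(penalty, theta0, t0=1.0, cooling=0.97, n_iter=500, seed=0):
    """Minimize a non-differentiable penalty (e.g., CWC) by random perturbation;
    worse candidates are accepted with the Boltzmann probability exp(-delta/T)."""
    rng = np.random.default_rng(seed)
    theta = theta0.copy()
    f = penalty(theta)
    best, f_best, temp = theta.copy(), f, t0
    for _ in range(n_iter):
        # step length tied to the square root of the temperature
        cand = theta + np.sqrt(temp) * rng.uniform(-1.0, 1.0, theta.size)
        f_cand = penalty(cand)
        if f_cand < f or rng.random() < np.exp(-(f_cand - f) / temp):
            theta, f = cand, f_cand
            if f < f_best:
                best, f_best = theta.copy(), f
        temp *= cooling        # cooling limits uphill exploration over time
    return best, f_best

# toy penalty standing in for the CWC of a double-output network
target = np.array([0.3, -0.7, 1.2])
penalty = lambda th: float(((th - target) ** 2).sum())
best, f_best = simulated_annealing(penalty, theta0=np.zeros(3))
```

In the study, `theta0` would be the weights and biases of the pretrained double-output ANN, and `penalty` the CWC of the PIs those parameters produce.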
Since the CWC can change with small alterations of the network parameters, the algorithm's exploration range for all weights was limited in this study to ±10% of the initialization values. Moreover, to prevent overfitting and the algorithm's tendency to linger at low values of the temperature parameter, a maximum number of iteration epochs was used to terminate the training process instead of a termination criterion based on the penalty value.
d. Evaluation criteria
3. Results and discussion
a. Result of predictor screening
An optimum silhouette value of 0.65 was obtained by clustering the GCM data into five classes. The GCM output parameters selected via K-means clustering and the MI measure for statistical downscaling are reported in Table 1. As shown in Table 1, for precipitation at the Rasht station, precipitation values were not selected as dominant predictors: the GCM output Pr, which is the precipitation flux, is treated like any other GCM output and was not selected as a cluster center. Moreover, there might be predictors with higher MI values that are not cluster centers, since MI values were only calculated between the cluster centers and the predictands. The calculated MI values ranged between 0.1 and 1.31. Instead of selecting a threshold for the dominant predictors, the five cluster centers with the highest MI values were used in the modeling of each parameter.
Determined input parameters for downscaling of temperature and precipitation parameters for both case studies via CanESM5 GCM.
The MI values of the potential predictors with the precipitation data are relatively low, reflecting the highly stochastic nature of precipitation. Moreover, the MI values for the precipitation of Rasht were higher than those for Tabriz, which results from the different distributions of precipitation data in the two regions. The humidity-related Hurs and Huss parameters of the GCM grid point nearest to Tabriz and Urmia Lake were selected for precipitation downscaling in Tabriz. A large water body such as Urmia Lake has a significant impact on the region's climate, and the humidity variables near the lake can be informative indicators for downscaling. Moreover, considering that a large part of Urmia Lake is situated to the south of Tabriz, Vas, the northward wind, can represent the amount of humidity being transferred from the lake to Tabriz. In addition, the temperature at around 2000-m altitude (Ta 75 000) can indicate the condition of the clouds that can produce rain. However, for downscaling precipitation in Rasht, pressure-related parameters (Ps and Psl) were also selected; the pressure at two grid points can indicate the general flow direction of the wind and humidity.
For downscaling temperature, temperature-related variables (Ta and Tas) were selected for both stations, yielding high MI values. Generally, temperature has a less stochastic nature than precipitation; thus, the MI values calculated for the temperature data were significantly higher than those for the precipitation data in both cities. Moreover, the temperature at different locations and altitudes can be the most informative variable for downscaling temperature. Evaporation flux (evspsbl) and upwelling longwave flux (Rlus) are also directly, though nonlinearly, correlated with temperature. Additionally, cloud area fraction (Clt), which affects radiation passage over the region, can be a deterministic predictor. Therefore, Clt was also selected as a candidate predictor for efficient modeling.
b. Result of point predictions
The optimum values of the hyperparameters (hidden neuron/cell number and epoch number), obtained via a trial-and-error process, are shown in Table 2. The LSTM model produced more accurate outcomes with a higher number of hidden neurons, while for the FFNN model the best outcomes were obtained with fewer hidden neurons. The Levenberg–Marquardt algorithm and the adaptive moment estimation (Adam) algorithm were used for training the FFNN and LSTM models, respectively.
The obtained optimal hyperparameters of ANNs for point prediction in downscaling precipitation and temperature for the Tabriz and Rasht stations.
The results of downscaling via LSTM have also shown that the accuracy of the model is significantly lower for the first time steps of the training subset due to the edge effect (Fig. 5). Moreover, it is notable that the LSTM model needs a longer warm-up period than other RNNs, due to the cell state that represents the long-term memory of the LSTM cell. In hydroclimatic studies in which the data have a seasonal pattern, the cell state needs at least one or two cycles (i.e., 2 yr) of prediction to reach the optimum conditions, as shown in Fig. 5.
In the first approach, statistical downscaling was performed via point predictions, where the uncertainty of the predictions is not assessable. The results of this approach, tabulated in Table 3, show that the LSTM model downscaled with higher accuracy. This higher accuracy stems from the recurrent states, especially the cell state, which carries long-term correlations into the predictions. Thus, LSTM was capable of utilizing the observations (predictands) of previous time steps in its predictions, and it could handle the weak temporal correspondence between the GCM outputs and the local observations.
Results of point predictions for both case studies and for both climatic parameters.
Moreover, both models showed higher accuracy in downscaling temperature than precipitation, in line with previous studies (Nourani et al. 2019; Karger et al. 2020; Jiang et al. 2020). This outcome stems mainly from the uncertainty and high internal variability of the precipitation phenomenon (Almazroui et al. 2020). Precipitation data are highly skewed, with frequent zero values in the observations, which makes predicting extreme values even more challenging. The temperature data, on the other hand, show more autoregressive behavior. Thus, the models were capable of high-accuracy predictions, especially the LSTM model, which can exploit the autoregressive characteristics of the time series autonomously.
Taylor diagrams in Fig. 6 present the correlation coefficient (CC) between observed and calculated values on the azimuthal angle, the standard deviation of the calculated values as the radial distance from the origin, and the RMSE as the distance from the observation point, which lies on the x axis at the observed standard deviation. As shown in Table 3, the RMSE of the precipitation downscaling for Rasht is relatively high. RMSE is a deviation-based evaluation criterion; therefore, as shown in Fig. 6, the high RMSE in precipitation downscaling for Rasht corresponds to the high standard deviation of the observed precipitation data. The RMSE for precipitation in Tabriz follows a similar ratio to the standard deviation of the Tabriz precipitation data, albeit with lower RMSE and standard deviation values. Moreover, as shown in Table 3 and Fig. 6, the dimensionless criteria (i.e., NSE and CC) indicate very similar modeling accuracy, which also suggests that the higher RMSE is largely due to the high standard deviation of the data. The observed precipitation data also have a larger standard deviation than the ANN predictions (Fig. 6); it can therefore be concluded that the ANNs, with the provided GCM data, could not produce predictions whose distribution matched that of the observations. Similar CC values (around 0.6) have been obtained for precipitation downscaling in recent studies (Wen et al. 2023; Yang et al. 2023). CC values around 0.6 and NSE values around 0.4 show that statistical downscaling models are generally capable of detecting precipitation events but not of accurately predicting extreme events. These findings underline the need for uncertainty quantification and analysis in the statistical downscaling of precipitation.
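The deviation-based and dimensionless criteria discussed above can be computed as follows, assuming the standard definitions of RMSE, NSE (Nash and Sutcliffe 1970), and the Pearson CC; the short arrays are illustrative only:

```python
import numpy as np

def rmse(obs, sim):
    """Root-mean-square error (deviation-based criterion)."""
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency (dimensionless; 1 is a perfect fit)."""
    return float(1 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

def cc(obs, sim):
    """Pearson correlation coefficient (dimensionless)."""
    return float(np.corrcoef(obs, sim)[0, 1])

obs = np.array([2.0, 0.0, 5.0, 1.0, 3.0])  # toy observed series
sim = np.array([1.5, 0.5, 4.0, 1.0, 3.5])  # toy downscaled series
print(rmse(obs, sim), nse(obs, sim), cc(obs, sim))
```

Because RMSE scales with the data's standard deviation while NSE and CC do not, a series with larger variance (as for Rasht precipitation) can show a higher RMSE at the same NSE and CC.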
Comparing historical observations and RCM outputs at the nearest grid point yielded NSE values of −0.51 and −1.02 for precipitation in Tabriz and Rasht, respectively, while the RCM outputs for temperature yielded NSE values of −0.02 and 0.41 for the Tabriz and Rasht stations, respectively. Consequently, the RCM outputs are not capable of accurately predicting extreme events. Moreover, despite the very low NSE values, the RCM outputs and historical temperature observations in Tabriz show a linear correlation of 0.6. A bias between the RCM outputs and the observed values is therefore evident, which probably stems from the bias of the mother GCM. In addition, the statistical downscaling achieved higher accuracy than the RCM outputs, suggesting that statistical downscaling of the RCM to the synoptic station scale is more informative than the RCM grid points for the study region.
c. Results of prediction interval estimation
The hyperparameters determined for the point prediction were then employed to estimate PIs. The LUBE method was applied here based on SA; therefore, a set of hyperparameters, including the initial temperature and η, was also determined via a trial-and-error process. As also noted by Nourani et al. (2022), LSTM-based LUBE was capable of handling a higher initial temperature than FFNN-based LUBE. However, because of the higher number of parameters and the complexity of the network structure, the LSTM-based LUBE stagnated after about 100 iterations; thus, its maximum number of iterations was limited to 100, while for the FFNN-based LUBE the maximum was set to 200 and 300 iterations for precipitation and temperature modeling, respectively. Moreover, to avoid the pitfall of equal lower and upper bounds (which results in a near-zero NMPIW), a higher η (50 and 70) was assigned to the FFNN-based models. LSTM-based LUBE models were less susceptible to this pitfall, and thus η was set to 5 for them.
In the optimization procedure via CWC, two target (nominal) confidence levels, 80% (μ = 0.8 in CWC) and 90%, were used and compared to provide decision-makers with two confidence options, which may aid economic optimization (e.g., drainage systems are usually designed for different return periods of precipitation). As discussed earlier, LUBE is a distribution-free PI method: higher confidence levels result in wider PIs, but a separate model must be trained for each confidence level. Employing two confidence levels in this study therefore also helps to assess the characteristics of the PIs estimated by LUBE at different confidence levels.
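A miniature sketch of SA-driven LUBE training with a CWC cost may clarify the procedure (cf. Khosravi et al. 2011; Kirkpatrick et al. 1983). The linear bound "network," the synthetic data, the cooling schedule, and the guard against the equal-bounds pitfall are illustrative assumptions, not the study's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = 2 * x + rng.normal(0, 0.2, 200)  # synthetic predictand

def bounds(theta, x):
    # Toy "network": linear lower and upper bounds with four parameters
    return theta[0] * x + theta[1], theta[2] * x + theta[3]

def cwc(theta, x, y, mu=0.9, eta=5.0):
    lo, hi = bounds(theta, x)
    if np.any(hi <= lo):
        return np.inf  # reject crossed/equal bounds (the near-zero-NMPIW pitfall)
    picp = np.mean((y >= lo) & (y <= hi))           # PI coverage probability
    nmpiw = np.mean(hi - lo) / (y.max() - y.min())  # normalized mean PI width
    gamma = 1.0 if picp < mu else 0.0               # penalize under-coverage only
    return nmpiw * (1 + gamma * np.exp(-eta * (picp - mu)))

theta = np.array([2.0, -0.5, 2.0, 0.5])  # initial lower/upper bound parameters
cost, T = cwc(theta, x, y), 1.0          # initial cost and SA temperature
for _ in range(2000):
    cand = theta + rng.normal(0, 0.05, 4)            # random perturbation
    c = cwc(cand, x, y)
    if c < cost or rng.random() < np.exp((cost - c) / T):
        theta, cost = cand, c                        # accept better (rarely, worse)
    T *= 0.998                                       # geometric cooling
print(round(cost, 3))
```

The SA loop narrows the PI width (NMPIW) while the exponential penalty keeps the coverage (PICP) near the nominal level μ; a larger η makes under-coverage costlier, which is why higher η values were assigned to the models prone to the equal-bounds pitfall.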
Table 4 shows the results obtained for confidence levels of 80% and 90%. All models achieved the nominal confidence levels, with PICP values higher than μ in the validation subset, so the NMPIW values determined the CWC values. It is therefore concluded that the LUBE method can estimate PIs at the demanded confidence level, with different confidence levels yielding different PI widths. LSTM-based LUBE shows higher skill than FFNN-based LUBE, yielding lower CWC in all cases (Table 4). The LSTM structure contains more parameters, and the connections among them are designed for time series modeling; as a result, more degrees of freedom are available to the SA for finding the optimal solution, so LSTM-based LUBE could achieve narrower PIs at the same confidence level. The hidden and cell states (recurrent states) in LSTM-based LUBE were fixed after the training step because of the difference between the penalty functions of LSTM and LSTM-based LUBE. However, through the concatenation of inputs and recurrent states, LSTM-based LUBE could still include the influence of previous time steps in its predictions, whereas FFNN-based LUBE (like FFNN) could not utilize the autoregressive characteristics of the data. Consequently, the detection of the predictands’ autoregressive characteristics by the LSTM cell and its gated structure improved the accuracy of the downscaling model and reduced the uncertainty in the modeling.
Results of PIs estimation for both case studies and both climatic parameters with nominal confidence levels of 80% and 90%.
Similar to the point prediction results, the estimated PIs for temperature downscaling showed higher performance, with lower uncertainty than for precipitation (15% lower CWC on average). Both models, at both confidence levels, captured the extrema in downscaling temperature for both stations, which allowed them to estimate narrow PIs (Figs. 7 and 8). However, owing to the high uncertainty and stochastic character of the precipitation data, covering extreme events was challenging; a higher confidence level allowed more peak values to be covered in precipitation downscaling. In addition, because precipitation is naturally bounded below by zero, most of the challenge for the models lay in predicting accurate upper bounds.
In addition, both the point prediction and PI methods yielded satisfactory results in estimating precipitation in the dry seasons. However, point predictions mostly produced large errors in modeling the wet season, which contains the peak precipitation values. It is worth mentioning that the wet season lasts from October to May in Tabriz and from September to March in Rasht. Although the wet season occurs in different months in the two regions, the uncertainty in the estimated PIs of the wet seasons increased markedly for both cities. This increase can be observed in the widening of the PIs while the coverage rate is maintained (Figs. 7 and 8). Moreover, the estimated PIs covered most of the point predictions as well as the observations, showing that the PIs can cover alternative possible scenarios (e.g., the point predictions), especially where point predictions and observed values did not match well. Consequently, while a point prediction with a large error could mislead decision-makers, the proposed PIs can inform them about possible extreme events. The results also indicate that the proposed methods performed similarly in terms of CWC for precipitation downscaling at both stations: although the two precipitation time series have different probability distributions and neither is stationary, the distribution-free LUBE method maintained its uncertainty-quantification performance in both cases.
PIs shown in Figs. 8 and 9 suggest that the trained ANNs predict similar PIs in the same months of different years. This is because the structure of an ANN is fixed in the validation step, so it acts as a deterministic function: similar sets of inputs produce similar outputs. GCM outputs mostly behave as signals with a slowly changing mean; hence, in consecutive years the GCM outputs are very similar, and so are the values the ANNs estimate from them. However, over long lead times, where the mean of the input values changes significantly, the estimated PIs change significantly as well (Figs. 10 and 11).
Statistical assessments can provide very useful information for decision-makers, such as the return period of extreme events and the current general trends of climate variables. However, future conditions may not follow past trends, owing to essential changes in anthropogenic activities such as emission rates, so extrapolating a trend that was valid for the past may not be appropriate. Future simulations of GCMs, in contrast, make it possible to assess the outcomes of different scenarios. Statistical downscaling is therefore an advantageous approach in climate research for future projections of climate variables.
d. Results of projection for future
The projections for the near future (2046–50) and far future (2096–2100) were conducted under the SSP5-8.5 scenario. The LSTM and LSTM-based LUBE models were utilized as the most efficient models for point prediction and PI projection, respectively. The projected precipitation and temperature were then compared with the observed values of the base period (i.e., 2006–10 for the Tabriz station and 2008–12 for the Rasht station) and with the projected data of the nearest Canadian Regional Climate Model 4 (CanRCM4) grid point for the same periods, used as pseudo-observations (for pseudo-observations, see Vrac et al. 2007).
Point prediction projections of precipitation for Tabriz show no significant change in average precipitation in the near future but a 20% decrease in the far future with respect to the base period. Moreover, the point predictions indicate that although there will be no specific change in wet-season precipitation for Tabriz, precipitation rates may decrease significantly in the dry season (Fig. 9a). The RCM projections, however, show 20% and 45% increases in precipitation for the near and far futures, respectively. Despite this contradiction, the upper bounds of the estimated PIs indicate upper limits of 150% and 95% for the increase in average precipitation by the near and far futures, respectively, at a 90% confidence level (Fig. 10a).
In contrast, for the Rasht station, point predictions show 25% and 70% increases in precipitation for the near and far futures, respectively, mostly related to an increase in the wet season (Fig. 9b). The RCM projections indicate a decrease in precipitation for both the near and far futures. The upper bounds of the PI projections suggest upper limits of 40% and 150% more precipitation for the near and far futures, respectively, at an 80% confidence level (Fig. 10b). Moreover, the lower bound of the estimated PIs for precipitation in Rasht shows an increasing pattern at both confidence levels, indicating the possibility of sustained, heavy rainfall events.
While consecutive heavy rainfalls could have disastrous outcomes, such as submerged cities and large numbers of fatalities, continuous rainfall over a long period could make victims even more vulnerable by damaging more property and preventing recovery (Yusmah et al. 2020). Furthermore, continuous rainfall could also result in groundwater floods as more water gradually permeates underground (Adedeji et al. 2019). In addition, sustained precipitation can negatively affect subways and metro ridership (Zhou et al. 2021). In light of these results and concerns, development-plan designers are advised to prepare water resilience systems for sustained precipitation and to plan on utilizing it in the region of Rasht city.
The point predictions of the statistical downscaling approach and the dynamical projections both agree on an increase in average temperature in Tabriz. However, the point predictions suggest increases in average temperature of 2° and 5°C for the near and far futures, respectively, while the RCM projections show a similar mean temperature for the near future but a 2°C increase for the far future. Moreover, the lower bounds of the estimated PIs indicate that there will no longer be months with an average temperature below 0°C, especially in the far future at Tabriz. Alteration of the hydrological and hydroclimatic cycle and balance is therefore expected in the region in the future (Fig. 11a).
Point predictions show that the temperature in Rasht will increase by 1°C in both the near and far futures, while RCM projections show a 2°C increase in average temperature in the near future and a 5°C increase in the far future. However, the PIs with a 90% confidence level cover both the statistical and dynamical downscaling projections of average temperature at the Rasht station (Fig. 11b).
A comparison with the RCM projections showed that the point prediction projections and the RCM do not match, particularly for precipitation; mismatched projections imply that at least one of them must be noncredible. While the point predictions suggest that precipitation in Tabriz will decrease while retaining its previous distribution, the RCM projections indicate that mean precipitation will increase and that precipitation events will have a wider distribution (more extreme events). In Rasht, the point predictions show an increasing trend with a wider range, while the RCM suggests lower precipitation with a denser distribution. Despite this contradiction, the projected PIs were capable of covering both the point projections and the RCM values, demonstrating the capability of the model to address the uncertainties involved in future projections. Furthermore, despite covering both the point predictions and the RCM outputs, the distance between the means of the upper and lower bounds did not increase in the PI projections for the future. Thus, while reconciling the contradiction between point predictions and RCM outputs, the PIs did not widen and could still provide useful and reliable intervals for decision-makers.
4. Conclusions
In this study, the accuracy and uncertainty of ANN-based statistical downscaling of precipitation and temperature were investigated. LSTM and FFNN were used as point prediction models, and the LUBE method was used for uncertainty assessment of the models. Precipitation and temperature data from the Tabriz and Rasht synoptic stations in Iran were the targets of the downscaling, and predictor screening was done via the MI and K-means clustering methods. The DL-based LSTM outperformed FFNN, yielding on average 55% higher NSE. The proposed LUBE method via the LSTM network gave more reliable results than LUBE modeling via the classic FFNN, with overall 25% lower CWC for uncertainty quantification of the downscaled hydroclimatic parameters.
Furthermore, the results showed that PIs estimated at a 90% confidence level are 35% wider than PIs estimated at an 80% confidence level. After comparing the accuracy of the models, the superior models for the point prediction and PI estimation tasks were used for future projections of temperature and precipitation at both stations. Two periods, 2046–50 and 2096–2100, were considered as the near and far future periods, and a base period for each station was taken from the validation subset of the data for comparison. The results illustrated that the Tabriz station is going to face droughts in the dry season, especially in the far future, accompanied by warmer winters and drastic changes in the climate regime, such as no further expected snowfall. These results warn planners and decision-makers to design plans adaptable to the mentioned changes. It is suggested to assess possible ways to capture and store storm water, such as artificial lakes and groundwater recharge zones (Thomas 1987). Moreover, since the decrease in precipitation increases the stress on groundwater resources, it is suggested to develop sustainable groundwater consumption plans. Although the magnitude of extreme precipitation is expected to decrease in the region of the Rasht station, more sustained precipitation with lower intensity is also expected, particularly in the far future. Sustained rainfall can raise groundwater levels and cause groundwater floods; therefore, decision-makers and planners are advised to include groundwater flood prevention systems in urban development plans. Furthermore, the precipitation and temperature projections suggest a more uniform climate pattern in the region of the Rasht station, with similar climatic conditions appearing across different seasons in the results.
Since this study was based on a single GCM and a single future scenario, it is suggested that future studies use a multi-GCM, multiscenario ensemble approach with the LSTM-based LUBE method to cover more sources of uncertainty. It is also suggested to apply the methods employed here to model or downscale other hydroclimatologic parameters, and to apply different data-preprocessing methods, such as wavelet transforms coupled with the mentioned methods, to enhance the accuracy of the results. In addition, LSTM-based LUBE has been shown to perform reliably in two different regions. Since only GCM outputs and observed synoptic-scale data are provided to the model, different models still need to be trained for different regions; a single trained model is not yet applicable across regions. However, to enhance the generalizability of the model and include physical knowledge in it, employing topographical maps and the distances of the grid points from the synoptic station via image processing could be explored in future research. Ultimately, data-driven models are highly sensitive to the amount of data used, and with small datasets, models are susceptible to outliers and anomalies in the data; a larger dataset can reduce the risk of being affected by low-quality data. Consequently, employing daily GCM data and synoptic station observations in future research can increase the reliability of the LSTM-based LUBE model for uncertainty quantification of statistical downscaling.
Acknowledgments.
This work was supported by the core-to-core project, the Japan Society for the Promotion of Science (JSPS) (Grant JPJSCCB202200044), and the Disaster Prevention Research Institute (DPRI) internal research fund (Grant 2022L-04).
Data availability statement.
The data supporting the findings of this study are available from the corresponding author upon reasonable request.
REFERENCES
Adedeji, T., D. Proverbs, H. Xiao, P. Cobbing, and V. Oladokun, 2019: Making Birmingham a flood resilient city: Challenges and opportunities. Water, 11, 1699, https://doi.org/10.3390/w11081699.
Aghabozorgi, S., A. S. Shirkhorshidi, and T. Y. Wah, 2015: Time-series clustering—A decade review. Inf. Syst., 53, 16–38, https://doi.org/10.1016/j.is.2015.04.007.
Almazroui, M., S. Saeed, F. Saeed, M. N. Islam, and M. Ismail, 2020: Projections of precipitation and temperature over the South Asian countries in CMIP6. Earth Syst. Environ., 4, 297–320, https://doi.org/10.1007/s41748-020-00157-7.
Baño-Medina, J., R. Manzanas, and J. M. Gutiérrez, 2020: Configuration and intercomparison of deep learning neural models for statistical downscaling. Geosci. Model Dev., 13, 2109–2124, https://doi.org/10.5194/gmd-13-2109-2020.
de Veaux, R. D., J. Schumi, J. Schweinsberg, and L. H. Ungar, 1998: Prediction intervals for neural networks via nonlinear regression. Technometrics, 40, 273–282, https://doi.org/10.2307/1270528.
Efron, B., 1979: Bootstrap methods: Another look at the jackknife. Ann. Stat., 7, 1–26, https://doi.org/10.1214/aos/1176344552.
Ehteram, M., S. F. Mousavi, H. Karami, S. Farzin, V. P. Singh, K.-w. Chau, and A. El-Shafie, 2018: Reservoir operation based on evolutionary algorithms and multi-criteria decision-making under climate change and uncertainty. J. Hydroinf., 20, 332–355, https://doi.org/10.2166/hydro.2018.094.
Halkidi, M., Y. Batistakis, and M. Vazirgiannis, 2001: On clustering validation techniques. J. Intell. Inf. Syst., 17, 107–145, https://doi.org/10.1023/A:1012801612483.
Hochreiter, S., and J. Schmidhuber, 1997: Long short-term memory. Neural Comput., 9, 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735.
Hosseini Baghanam, A., V. Nourani, M.-A. Keynejad, H. Taghipour, and M.-T. Alami, 2019: Conjunction of wavelet-entropy and SOM clustering for multi-GCM statistical downscaling. Hydrol. Res., 50, 1–23, https://doi.org/10.2166/nh.2018.169.
Jiang, D., D. Hu, Z. Tian, and X. Lang, 2020: Differences between CMIP6 and CMIP5 models in simulating climate over China and the East Asian monsoon. Adv. Atmos. Sci., 37, 1102–1118, https://doi.org/10.1007/s00376-020-2034-y.
Kabir, H. M. D., A. Khosravi, M. Anwar Hosen, and S. Nahavandi, 2018: Neural network-based uncertainty quantification: A survey of methodologies and applications. IEEE Access, 6, 36 218–36 234, https://doi.org/10.1109/ACCESS.2018.2836917.
Karger, D. N., D. R. Schmatz, G. Dettling, and N. E. Zimmermann, 2020: High-resolution monthly precipitation and temperature time series from 2006 to 2100. Sci. Data, 7, 248, https://doi.org/10.1038/s41597-020-00587-y.
Khosravi, A., S. Nahavandi, D. Creighton, and A. F. Atiya, 2011: Lower upper bound estimation method for construction of neural network-based prediction intervals. IEEE Trans. Neural Networks, 22, 337–346, https://doi.org/10.1109/TNN.2010.2096824.
Kim, H. H., 1992: Urban heat island. Int. J. Remote Sens., 13, 2319–2336, https://doi.org/10.1080/01431169208904271.
Kirkpatrick, S., C. D. Gelatt, and M. P. Vecchi, 1983: Optimization by simulated annealing. Science, 220, 671–680, https://doi.org/10.1126/science.220.4598.671.
Legates, D. R., and G. J. McCabe Jr., 1999: Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation. Water Resour. Res., 35, 233–241, https://doi.org/10.1029/1998WR900018.
Li, X., and V. Babovic, 2019: Multi-site multivariate downscaling of global climate model outputs: An integrated framework combining quantile mapping, stochastic weather generator and empirical copula approaches. Climate Dyn., 52, 5775–5799, https://doi.org/10.1007/s00382-018-4480-0.
Lun, Y., L. Liu, R. Wang, and G. Huang, 2020: Optimization assessment of projection methods of climate change for discrepancies between North and South China. Water, 12, 3106, https://doi.org/10.3390/w12113106.
MacQueen, J. B., 1967: Some methods for classification and analysis of multivariate observations. Statistics, Vol. 1, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, 281–297, https://digitalassets.lib.berkeley.edu/math/ucb/text/math_s5_v1_article-17.pdf.
Maraun, D., and M. Widmann, 2018: Perfect prognosis. Statistical Downscaling and Bias Correction for Climate Research, Cambridge University Press, 141–169.
Moriasi, D. N., J. G. Arnold, M. W. Van Liew, R. L. Bingner, R. D. Harmel, and T. L. Veith, 2007: Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Trans. ASABE, 50, 885–900, https://doi.org/10.13031/2013.23153.
Nash, J. E., and J. V. Sutcliffe, 1970: River flow forecasting through conceptual models part I—A discussion of principles. J. Hydrol., 10, 282–290, https://doi.org/10.1016/0022-1694(70)90255-6.
Nix, D. A., and A. S. Weigend, 1994: Estimating the mean and variance of the target probability distribution. Proc. 1994 IEEE Int. Conf. on Neural Networks (ICNN’94), Orlando, FL, IEEE, 55–60, https://doi.org/10.1109/ICNN.1994.374138.
Nourani, V., A. H. Baghanam, and H. Gokcekus, 2018: Data-driven ensemble model to statistically downscale rainfall using nonlinear predictor screening approach. J. Hydrol., 565, 538–551, https://doi.org/10.1016/j.jhydrol.2018.08.049.
Nourani, V., N. J. Paknezhad, E. Sharghi, and A. Khosravi, 2019: Estimation of prediction interval in ANN-based multi-GCMs downscaling of hydro-climatologic parameters. J. Hydrol., 579, 124226, https://doi.org/10.1016/j.jhydrol.2019.124226.
Nourani, V., K. Khodkar, and M. Gebremichael, 2022: Uncertainty assessment of LSTM based groundwater level predictions. Hydrol. Sci. J., 67, 773–790, https://doi.org/10.1080/02626667.2022.2046755.
Nourani, V., A. H. Ghareh Tapeh, K. Khodkar, and J. J. Huang, 2023: Assessing long-term climate change impact on spatiotemporal changes of groundwater level using autoregressive-based and ensemble machine learning models. J. Environ. Manage., 336, 117653, https://doi.org/10.1016/j.jenvman.2023.117653.
Raju, K. S., and D. N. Kumar, 2020: Review of approaches for selection and ensembling of GCMs. J. Water Climate Change, 11, 577–599, https://doi.org/10.2166/wcc.2020.128.
Riahi, K., and Coauthors, 2017: The shared socioeconomic pathways and their energy, land use, and greenhouse gas emissions implications: An overview. Global Environ. Change, 42, 153–168, https://doi.org/10.1016/j.gloenvcha.2016.05.009.
Sha, Y., D. Gange II, G. West, and R. Stull, 2020: Deep-learning-based gridded downscaling of surface meteorological variables in complex terrain. Part II: Daily precipitation. J. Appl. Meteor. Climatol., 59, 2075–2092, https://doi.org/10.1175/JAMC-D-20-0058.1.
Shamshirband, S., E. Jafari Nodoushan, J. E. Adolf, A. Abdul Manaf, A. Mosavi, and K.-w. Chau, 2019: Ensemble models with uncertainty analysis for multi-day ahead forecasting of chlorophyll a concentration in coastal waters. Eng. Appl. Comput. Fluid Mech., 13, 91–101, https://doi.org/10.1080/19942060.2018.1553742.
Shannon, C. E., 1948: A mathematical theory of communication. Bell Syst. Tech. J., 27, 379–423, https://doi.org/10.1002/j.1538-7305.1948.tb01338.x.
Sit, M., B. Z. Demiray, Z. Xiang, G. J. Ewing, Y. Sermet, and I. Demir, 2020: A comprehensive review of deep learning applications in hydrology and water resources. Water Sci. Technol., 82, 2635–2670, https://doi.org/10.2166/wst.2020.369.
Sun, L., and Y. Lan, 2021: Statistical downscaling of daily temperature and precipitation over China using deep learning neural models: Localization and comparison with other methods. Int. J. Climatol., 41, 1128–1147, https://doi.org/10.1002/joc.6769.
Swart, N. C., and Coauthors, 2019: The Canadian Earth System Model version 5 (CanESM5.0.3). Geosci. Model Dev., 12, 4823–4873, https://doi.org/10.5194/gmd-12-4823-2019.
Tabari, H., S. M. Paz, D. Buekenhout, and P. Willems, 2021: Comparison of statistical downscaling methods for climate change impact analysis on precipitation-driven drought. Hydrol. Earth Syst. Sci., 25, 3493–3517, https://doi.org/10.5194/hess-25-3493-2021.
Taormina, R., and K.-W. Chau, 2015: ANN-based interval forecasting of streamflow discharges using the LUBE method and MOFIPS. Eng. Appl. Artif. Intell., 45, 429–440, https://doi.org/10.1016/j.engappai.2015.07.019.
Thomas, R. K., 1987: Ground water recharge for Oklahoma—An analysis of past and future methodology. Ground Water Quality and Agricultural Practices, CRC Press, 54–70.
Tokarska, K. B., M. B. Stolpe, S. Sippel, E. M. Fischer, C. J. Smith, F. Lehner, and R. Knutti, 2020: Past warming trend constrains future warming in CMIP6 models. Sci. Adv., 6, eaaz9549, https://doi.org/10.1126/sciadv.aaz9549.
Tran Anh, D., S. P. Van, T. D. Dang, and L. P. Hoang, 2019: Downscaling rainfall using deep learning long short‐term memory and feedforward neural network. Int. J. Climatol., 39, 4170–4188, https://doi.org/10.1002/joc.6066.
Vandal, T., E. Kodra, J. Dy, S. Ganguly, R. Nemani, and A. R. Ganguly, 2018: Quantifying uncertainty in discrete-continuous and skewed data with Bayesian deep learning. KDD’18: Proc. 24th ACM SIGKDD Int. Conf. on Knowledge Discovery & Data Mining, London, United Kingdom, Association for Computing Machinery, 2377–2386, https://dl.acm.org/doi/abs/10.1145/3219819.3219996.
Vrac, M., M. L. Stein, K. Hayhoe, and X.-Z. Liang, 2007: A general method for validating statistical downscaling methods under future climate change. Geophys. Res. Lett., 34, L18701, https://doi.org/10.1029/2007GL030295.
Wang, B., T. Li, Z. Yan, G. Zhang, and J. Lu, 2020: DeepPIPE: A distribution-free uncertainty quantification approach for time series forecasting. Neurocomputing, 397, 11–19, https://doi.org/10.1016/j.neucom.2020.01.111.
Wang, R., Q. Cheng, L. Liu, C. Yan, and G. Huang, 2019: Multi-model projections of climate change in different RCP scenarios in an arid inland region, northwest China. Water, 11, 347, https://doi.org/10.3390/w11020347.
Wen, Y., A. Yang, Y. Fan, B. Wang, and D. Scott, 2023: Stepwise cluster ensemble downscaling for drought projection under climate change. Int. J. Climatol., 43, 2318–2338, https://doi.org/10.1002/joc.7977.
Xiang, Z., and I. Demir, 2020: Distributed long-term hourly streamflow predictions using deep learning—A case study for state of Iowa. Environ. Modell. Software, 131, 104761, https://doi.org/10.1016/j.envsoft.2020.104761.
Xu, Z., Y. Han, and Z. Yang, 2019: Dynamical downscaling of regional climate: A review of methods and limitations. Sci. China Earth Sci., 62, 365–375, https://doi.org/10.1007/s11430-018-9261-5.
Yan, L., J. Feng, T. Hang, and Y. Zhu, 2021: Flow interval prediction based on deep residual network and lower and upper boundary estimation method. Appl. Soft Comput., 104, 107228, https://doi.org/10.1016/j.asoc.2021.107228.
Yang, D., S. Liu, Y. Hu, X. Liu, J. Xie, and L. Zhao, 2023: Predictor selection for CNN-based statistical downscaling of monthly precipitation. Adv. Atmos. Sci., 40, 1117–1131, https://doi.org/10.1007/s00376-022-2119-x.
Yang, H. H., S. Van Vuuren, S. Sharma, and H. Hermansky, 2000: Relevance of time–frequency features for phonetic and speaker-channel classification. Speech Commun., 31, 35–50, https://doi.org/10.1016/S0167-6393(00)00007-8.
Yusmah, M. Y. S., L. J. Bracken, Z. Sahdan, H. Norhaslina, M. D. Melasutra, A. Ghaffarianhoseini, S. Sumiliana, and A. S. Shereen Farisha, 2020: Understanding urban flood vulnerability and resilience: A case study of Kuantan, Pahang, Malaysia. Nat. Hazards, 101, 551–571, https://doi.org/10.1007/s11069-020-03885-1.
Zhou, Y., Z. Li, M. Yangyang, Z. Li, and M. Zhong, 2021: Analyzing spatio-temporal impacts of extreme rainfall events on metro ridership characteristics. Physica A, 577, 126053, https://doi.org/10.1016/j.physa.2021.126053.