1. Introduction
Many gridded meteorological datasets depend on measurements from station networks. However, stations often have missing data and short observation periods. Missing data are caused by factors such as instrumental malfunction, interrupted communication, and failure in quality control checks. Missing data limit the accuracy of gridded products and constrain the scope of climate analyses (Eischeid et al. 2000; Serrano-Notivoli et al. 2019).
Several methods have been developed and applied to infill and reconstruct gaps in station time series (Eischeid et al. 2000; Allen and DeGaetano 2001; Simolo et al. 2010; Woldesenbet et al. 2017; Beguería et al. 2019; Serrano-Notivoli et al. 2019; Tang et al. 2020; Pappas et al. 2014). As described by Tang et al. (2020), these methods can be grouped into four types: self-contained infilling, spatial interpolation/regression, quantile mapping, and machine learning. Serially complete datasets (SCDs) of daily/monthly precipitation and/or temperature have been developed based on different gap-filling methods in various regions, such as the western United States (Eischeid et al. 2000), northeast Spain (Vicente-Serrano et al. 2003), Andalusia in Spain (Ramos-Calzado et al. 2008), Sicily in Italy (Di Piazza et al. 2011), the upper Blue Nile basin (Woldesenbet et al. 2017), and the Hawaiian Islands (Longman et al. 2018). Recently, Tang et al. (2020) produced a daily SCD of precipitation and temperature for North America (SCDNA) by combining estimates from 16 strategies (variants of quantile mapping, spatial interpolation, and machine learning).
Gridded meteorological datasets are typically developed by using existing SCDs or applying gap-filling methods prior to spatial interpolation. For example, Di Luzio et al. (2008) constructed a gridded daily precipitation and temperature dataset for the contiguous United States (CONUS) using an extended version of the SCD developed by Eischeid et al. (2000). Longman et al. (2019) produced a gridded rainfall and temperature dataset for the Hawaiian Islands using the SCD from Longman et al. (2018). Newman et al. (2015, 2019, 2020) used quantile mapping to fill the gaps in daily precipitation and temperature data, and then used the gap-filled data to produce probabilistic estimates of precipitation across the CONUS, the Hawaiian Islands, and Alaska and the Yukon. Serrano-Notivoli et al. (2019) built an SCD using a k-nearest-neighbors regression approach to create a gridded temperature dataset for Spain.
The advantages of using SCDs in spatial interpolation are scarcely studied. A notable exception is the study by Longman et al. (2020), which shows that gap filling can improve the quality of gridded estimation of monthly precipitation in Hawaii. From a practical perspective, SCDs provide complete station records with a longer time span than original station observations and thus increase the effective spatial density of the station network in any specific time interval. Since station densities affect the accuracy of the spatial interpolation (Hofstra et al. 2010), it seems reasonable to assume that SCDs can improve the accuracy of gridded precipitation and temperature estimates.
The accuracy of the gap-filled data in SCDs depends on target variables, infilling/reconstruction methods, topography, seasons, station densities, and other factors (Eischeid et al. 2000; Tang et al. 2020). Moreover, it has not been studied whether gridded estimates based on SCDs can offer reasonable estimates of trends. Henn et al. (2018) compared six gridded precipitation datasets (one of which adopts gap filling prior to interpolation) in the western United States and found that those datasets show trends of different magnitude and sign. We assume this may partly explain why some gridded climate datasets (e.g., Haylock et al. 2008; Livneh et al. 2015) only use raw observations and exclude stations with too many missing values and short observation periods. In summary, we still have a limited understanding of the value of SCDs in spatial interpolation.
Here we explore the extent to which gap filling improves gridded precipitation and temperature estimates. We use the SCDNA (Tang et al. 2020), which provides daily precipitation and temperature data from ~27 000 stations. Multiple experiments are designed to quantify the benefits of SCDNA to improve the statistical accuracy and trend estimation using various interpolation methods. The results provide insights into the value of SCDs in producing gridded meteorological datasets and help researchers decide whether gap filling is useful in their studies.
2. SCD dataset
Gap-filled precipitation and temperature data from 1979 to 2018 are obtained from SCDNA (Tang et al. 2020). The dataset has 24 615 precipitation, 19 604 daily minimum temperature, and 19 611 daily maximum temperature stations in North America (Fig. 1; open access on Zenodo: https://doi.org/10.5281/zenodo.3953310). SCDNA is built using station data from four databases: the Global Historical Climatology Network-Daily (GHCN-D; Menne et al. 2012), the Global Surface Summary of the Day (GSOD), Environment and Climate Change Canada (ECCC), and the Mexico database from Servicio Meteorológico Nacional (Livneh et al. 2015). Raw station observations undergo strict quality control following the procedures applied in popular datasets (Beck et al. 2019; Durre et al. 2010; Hamada et al. 2011).
The densities of (a) precipitation stations and (b) temperature stations at the 0.5° × 0.5° resolution over North America.
Citation: Journal of Hydrometeorology 22, 6; 10.1175/JHM-D-20-0313.1
SCDNA is composed of raw station observations, infilled data (i.e., missing values during the station measurement period), and reconstructed data (i.e., estimates outside the station measurement period). SCDNA is a single dataset produced by combining estimates from 16 strategies based on quantile mapping, spatial interpolation (inverse distance weighting, normal ratio, and multiple linear regression), machine learning [artificial neural network and random forest (RF; Breiman 2001)], and multisource merging. Data from three reanalysis products are used to assist infilling/reconstruction; the reanalysis products are particularly helpful in remote regions. The production of SCDNA has nine steps: 1) extracting reanalysis estimates that are spatiotemporally concurrent with stations; 2) extracting neighboring stations for every station; 3) estimating empirical cumulative distribution functions; 4) infilling and reconstruction based on 16 strategies; 5) independent validation of filled estimates; 6) merging all strategies to generate a single series of station data; 7) climatological bias correction; 8) comparison with benchmark datasets and replacing estimates by observations whenever possible; and 9) final quality control. Please refer to Tang et al. (2020) for more details.
SCDNA generally preserves the spatial correlation structure and the variance of time series. Independent validation shows that SCDNA precipitation and temperature estimates have high accuracy. The median values of the modified Kling–Gupta efficiency (KGE′; Gupta et al. 2009; Kling et al. 2012) are 0.90, 0.98, and 0.99 for precipitation, Tmin, and Tmax, respectively (Tang et al. 2020). The KGE′ values are calculated based on the final SCDNA estimates (i.e., after climatological bias correction) using the leave-one-out strategy. The details of KGE metrics are in section 3c.
3. Methodology
a. Interpolation methods
Interpolation methods affect the quality of gridded estimates. Simple statistical methods are often more sensitive to station densities and topography than more complex methods. Therefore, three categories of interpolation methods are used here, that is, statistical, machine learning, and knowledge-based methods.
The second category includes a machine learning method, that is, RF. Li et al. (2011) showed that the RF is effective for spatial interpolation of environmental variables (seabed mud content) and performs better than the support vector machine (SVM). Hengl et al. (2018) applied RF in the spatial prediction of different types of environmental datasets including rainfall. By using buffer distances to sampling points as explanatory variables, RF obtains equally accurate or even better estimates than various Kriging interpolation methods (Hengl et al. 2018). Other variables such as satellite precipitation datasets can also be used as explanatory variables of RF models (Baez-Villanueva et al. 2020). RF is also applied in interpolating temperature data (Appelhans et al. 2015; Webb et al. 2016). Here, the design of RF follows Tang et al. (2020), that is, we use latitude, longitude, and elevation as explanatory variables.
TIER uses a leave-one-out strategy to estimate the uncertainties of yj,b and β, separately, based on which the total uncertainties of yj can be obtained. For a grid j and its n neighboring stations, the leave-one-out strategy performs the interpolation n times by leaving out one neighboring station each time, and the standard deviation of the n estimates is used as uncertainty (TIER codes are available at https://doi.org/10.5281/zenodo.3234938).
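The leave-one-out idea described above can be illustrated with a minimal sketch. It is not the TIER code: a plain inverse-distance-weighted estimate stands in for TIER's estimate, the function names `idw` and `leave_one_out_uncertainty` are hypothetical, and the sample standard deviation is an assumption because the text only says "standard deviation".

```python
import math
import statistics


def idw(target, stations, power=2.0):
    """Inverse-distance-weighted estimate at `target` from (x, y, value) stations.

    Stand-in interpolator (assumption); TIER itself uses a different estimator.
    """
    num = den = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # target coincides with a station
        weight = dist ** -power
        num += weight * value
        den += weight
    return num / den


def leave_one_out_uncertainty(target, neighbors, interpolate=idw):
    """Spread of the n estimates obtained by dropping one neighbor at a time."""
    if len(neighbors) < 2:
        raise ValueError("need at least two neighboring stations")
    estimates = [
        interpolate(target, neighbors[:i] + neighbors[i + 1:])
        for i in range(len(neighbors))
    ]
    # Sample standard deviation of the n leave-one-out estimates (assumption).
    return statistics.stdev(estimates)
```

With three equidistant neighbors holding values 1, 2, and 3, the three leave-one-out estimates are 2.5, 2.0, and 1.5, so the uncertainty is 0.5; identical neighbor values give zero uncertainty.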
b. Design of the experiments
Four experiments are used to evaluate the added value of gap filling prior to spatial interpolation. Experiment 1 evaluates the impact of station densities on the added value of gap filling. Experiment 2 evaluates the effectiveness of gap filling using various interpolation methods. Experiment 3 quantifies the value of SCDNA for spatial interpolation of precipitation and temperature across North America. Experiment 4 evaluates whether gridded estimates based on SCDNA can capture long-term climate trends. In this study, station data are interpolated to the locations of validation stations instead of regular grids to facilitate evaluation.
Experiments 1 and 2 are performed within a 10° × 10° area ranging between 100° and 110°W and between 30° and 40°N (Fig. 2) to reduce the computational cost. This region is selected as 1) its station density is higher than most regions in North America (Figs. 2a,b), enabling realizations of different station densities, and 2) its topography is complex with elevations ranging from 460 to 3900 m (Fig. 2c), making it challenging to obtain accurate gridded estimates. This region comprises 1514 precipitation and 990 temperature stations from SCDNA. The median KGE′ values of SCDNA are 0.86, 0.98, and 0.99 for precipitation, Tmin, and Tmax, respectively (Fig. S1 in the online supplemental material). Experiments 3 and 4 are performed over all of North America.
(a) The location of the experimental area in North America. (b) The locations of stations for precipitation and temperature (P and T), precipitation only (P only), and temperature only (T only).
Experiment 1 evaluates the impact of station density on the added value of gap filling. First, we define actual stations as those with observations for at least 99% of days within a specific year, virtual stations as those with filled estimates for at least 99% of days within a specific year, and unqualified stations as those with observations for more than 1% but less than 99% of days. Unqualified stations are excluded to avoid unclear attribution of observations and filled data in interpolated estimates.
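The three station categories can be expressed as a small classification rule. The sketch below assumes that, in a serially complete record, every day of the year is either observed or filled, so one count determines the other; the function name `classify_station_year` is hypothetical.

```python
def classify_station_year(n_observed, n_days):
    """Label one station-year as 'actual', 'virtual', or 'unqualified'.

    n_observed: number of days with raw observations in the year.
    n_days: number of days in the year (365 or 366).
    Assumes every non-observed day carries a filled estimate.
    """
    if not 0 <= n_observed <= n_days:
        raise ValueError("n_observed must lie between 0 and n_days")
    n_filled = n_days - n_observed
    # Integer arithmetic avoids floating-point edge cases at the 99% threshold.
    if 100 * n_observed >= 99 * n_days:
        return "actual"
    if 100 * n_filled >= 99 * n_days:
        return "virtual"
    return "unqualified"
```

For 1984 (366 days), a station needs at least 363 observed days to count as actual and at most 3 observed days to count as virtual under this rule.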
Experiment 1 is implemented using IDW for the year 1984 to reduce computational cost. The year 1984 is selected because of its high ratio of virtual stations (stations with at least 99% gap-filled data) to actual stations (stations with at least 99% observations). Actual and virtual stations show similar spatial distributions, with higher precipitation and lower Tmin and Tmax in the northwest part due to the topographic effect (Fig. 3). The detailed steps are as follows:
1) Extract all actual and virtual stations for 1984. For precipitation, there are 433 actual stations and 829 virtual stations. For Tmax, the numbers are 235 and 447, respectively. For Tmin, the numbers are 192 and 443, respectively. The difference between Tmin and Tmax is caused by the exclusion of unqualified stations.
2) For precipitation, we generate networks composed of N1 (20, 40, 60, …, 420) actual stations and N2 (0, 20, 40, 60, …, 820) virtual stations by randomly sampling from stations in step 1, which results in 882 networks. The remaining actual stations (433 − N1) are used for validation. For Tmin and Tmax, the procedures are the same, but N1 and N2 increase with a step of 10 due to their smaller numbers of stations.
3) Precipitation, Tmin, and Tmax data are interpolated from the generated networks to the locations of validation stations. Then, for every N1, the accuracy of networks with virtual stations (N2 > 0) is compared to the accuracy of networks without virtual stations (N2 = 0) to quantify the added value of gap filling for different station densities.
4) Steps 2 and 3 are repeated 100 times to reduce the uncertainties caused by random sampling (i.e., random deployment of stations). The median values of accuracy metrics are used for each combination of N1 and N2.
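The network-generation step can be sketched as follows. The function names are hypothetical, and the station identifiers are placeholders; only the counts (433 actual, 829 virtual, step of 20) come from the text. Enumerating the combinations reproduces the 882 precipitation networks (21 values of N1 times 42 values of N2).

```python
import random


def network_sizes(n_actual, n_virtual, step):
    """All (N1, N2) pairs: N1 = step, 2*step, ... below n_actual; N2 = 0, step, ... up to n_virtual."""
    n1_values = range(step, n_actual, step)    # leaves at least one actual station for validation
    n2_values = range(0, n_virtual + 1, step)
    return [(n1, n2) for n1 in n1_values for n2 in n2_values]


def sample_network(actual_ids, virtual_ids, n1, n2, rng):
    """Randomly pick N1 actual and N2 virtual stations; unused actual stations validate."""
    chosen_actual = rng.sample(actual_ids, n1)
    chosen_virtual = rng.sample(virtual_ids, n2)
    chosen = set(chosen_actual)
    validation = [s for s in actual_ids if s not in chosen]
    return chosen_actual + chosen_virtual, validation


if __name__ == "__main__":
    sizes = network_sizes(433, 829, 20)
    print(len(sizes))  # number of (N1, N2) combinations for precipitation
```

Repeating `sample_network` 100 times per (N1, N2) pair with different random draws, and taking the median metric, corresponds to the repetition described in the last step.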
Spatial distribution of precipitation and temperature in 1984 in the experimental area based on (top) actual and (bottom) virtual stations.
Experiment 2 evaluates the effectiveness of gap filling using various interpolation methods. Experiment 2 follows the definition of actual and virtual stations in experiment 1 and is implemented for every year from 1979 to 2018 using various interpolation methods (nearest neighbor, bilinear interpolation, IDW, RF, and TIER). The detailed steps are as follows:
1) Extract actual and virtual stations for every year from 1979 to 2018. The number of virtual stations is larger than that of actual stations before 2005 for precipitation and 1996 for Tmin and Tmax (Fig. S2).
2) For every year, tenfold cross validation is used to evaluate the accuracy of interpolated precipitation and temperature data. This is realized by (i) randomly dividing the actual stations into 10 equal segments; (ii) using one segment as validation stations (10% of all actual stations), and then obtaining interpolated estimates for validation stations based on two station networks, that is, the remaining actual stations and actual plus virtual stations; (iii) repeating step ii 10 times for each of the segments in step i to get interpolated estimates for all actual stations.
3) For every year, compare the performance of interpolated estimates with and without virtual stations. Particularly, the effect of including virtual stations on the uncertainty of spatial interpolation is analyzed using TIER because it directly provides uncertainty estimates. TIER treats all stations equally, and thus the results in section 4b show the difference between actual stations and actual plus virtual stations.
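The tenfold cross validation in the steps above can be sketched as a generator that yields, for each fold, the validation stations and the two competing networks. The name `tenfold_splits` is hypothetical, and the interleaved slicing is just one way to obtain nearly equal segments when the station count is not a multiple of 10.

```python
import random


def tenfold_splits(actual_ids, virtual_ids, rng, k=10):
    """Yield (validation, actual_only_network, actual_plus_virtual_network) per fold."""
    shuffled = list(actual_ids)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k nearly equal random segments
    for fold in folds:
        held_out = set(fold)
        remaining = [s for s in shuffled if s not in held_out]
        # Virtual stations are only ever added to the network, never validated.
        yield fold, remaining, remaining + list(virtual_ids)
```

Each actual station is held out exactly once, so concatenating the validation estimates over the 10 folds gives one interpolated series per actual station for both networks.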
Experiment 3 quantifies the value of SCDNA for spatial interpolation of precipitation and temperature across North America. Experiment 3 uses all stations from SCDNA to demonstrate whether SCDNA can improve the statistical accuracy of gridded estimates in North America. The detailed steps are as follows:
1) Observation-based interpolation. For each target station and each day from 1979 to 2018, 10 neighboring stations with actual observations are selected to obtain interpolated estimates at the target point using IDW. Note that, since actual observations are used, the locations of the 10 neighboring stations could change during the study period.
2) SCDNA-based interpolation. The interpolation procedure is the same as the previous step, while the locations of neighboring stations do not change with time since their time series are complete because of gap filling.
3) The interpolation results from steps 1 and 2 are compared in North America from the spatial and temporal perspectives.
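The difference between the two interpolation settings can be illustrated with a minimal IDW sketch that selects the nearest stations having a value on a given day. The great-circle distance, the weighting exponent of 2, and the function names are assumptions made for illustration; the text only specifies IDW with 10 neighboring stations.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def idw_nearest(target, stations, day, k=10, power=2.0):
    """IDW estimate at `target` (lat, lon) for `day` from the k nearest stations with data.

    `stations` is a list of (lat, lon, series); series[day] is a float, or None for a gap.
    """
    candidates = []
    for lat, lon, series in stations:
        value = series[day]
        if value is not None:  # gaps remove the station from that day's neighbor pool
            candidates.append((haversine_km(target[0], target[1], lat, lon), value))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    if not nearest:
        return None
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [dist ** -power for dist, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
```

With gappy series, the set of selected neighbors depends on `day`, as in the observation-based setting; with serially complete series no entry is `None`, so the same k stations are selected on every day.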
Experiment 4 evaluates whether gridded estimates based on SCDNA can capture long-term climate trends. Experiment 4 resembles experiment 3 with two differences: 1) only stations with at least 20-yr observations are included in interpolation to achieve better trend estimation, and 2) only stations with at least 35-yr observations are included in validation to ensure the reliability of long-term trends. Linear trends from observation-based interpolation and SCDNA-based interpolation are compared using reference trends from validation stations.
c. Accuracy indicators
The mean term in KGE″ is standardized by the variance of observations instead of the mean value of observations in KGE, making KGE″ more suitable for the assessment of precipitation in dry areas and temperature (in Celsius) that may have a mean value close to zero in some regions.
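To make the role of the standardized mean term concrete, the sketch below computes a KGE″-style score. It rests on two assumptions that should be checked against the equations of this section: the score is taken to be 1 minus the Euclidean norm of (r − 1, α − 1, β), and the mean term β is taken to be the mean difference divided by the standard deviation of observations, which keeps the term dimensionless. The function name is hypothetical.

```python
import math
import statistics


def kge_double_prime(sim, obs):
    """KGE''-style score (assumed form): 1 - sqrt((r-1)^2 + (alpha-1)^2 + beta^2)."""
    if len(sim) != len(obs) or len(obs) < 2:
        raise ValueError("sim and obs must have the same length (at least 2)")
    mu_s, mu_o = statistics.fmean(sim), statistics.fmean(obs)
    sd_s, sd_o = statistics.pstdev(sim), statistics.pstdev(obs)
    if sd_s == 0.0 or sd_o == 0.0:
        raise ValueError("constant series: correlation is undefined")
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / len(obs)
    r = cov / (sd_s * sd_o)          # linear correlation
    alpha = sd_s / sd_o              # variability ratio
    beta = (mu_s - mu_o) / sd_o      # mean difference scaled by observed spread (assumption)
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + beta ** 2)
```

Because β does not divide by the observed mean, the score stays well behaved when the observed mean is close to zero, which is the situation described above for dry areas and for temperature in Celsius.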
4. Results
a. The effect of station density on spatial interpolation
Figure 4 shows the metrics of interpolated precipitation for different combinations of station networks in experiment 1. As the number of actual and virtual stations increases, KGE″ and CC values increase and RMSE values decrease, indicating higher accuracy of the spatial interpolation. The marginal benefits of increasing station numbers diminish as the station density increases. For example, KGE″ and CC change very little as the number of virtual stations increases from 400 to 800. This is anticipated as additional stations beyond a specific point do not add more information due to strong spatial correlation. According to the differences in metrics between networks with virtual stations (N2 > 0) and networks without virtual stations (N2 = 0), gap filling almost always contributes to improved accuracy of interpolated precipitation. The improvement is more obvious when the number of actual stations is low. Tmin shows patterns similar to those of precipitation (Fig. S3).
(a) KGE″, (c) CC, and (e) RMSE of interpolated precipitation estimates in 1984 using IDW for different combinations of actual and virtual stations. (b),(d),(f) The metric differences between networks with virtual stations and networks without virtual stations.
Tmax shows a different pattern compared to precipitation and Tmin (Fig. 5). When the number of actual stations is high, an increasing number of virtual stations results in slightly lower KGE″ and CC and higher RMSE. The CC values decrease with increased actual stations when the number of virtual stations is larger than 300. A possible reason is that Fig. 5 uses data from 1984, when virtual stations far outnumber actual stations (Fig. S2), which results in slightly worse SCDNA estimates at virtual stations (Fig. S4) because gap filling relies on observations from actual stations. We repeated the analysis for Tmax in 2000, which shows that an increasing number of virtual stations contributes to increased KGE″ and decreased RMSE (Fig. S5). However, this cannot explain why Tmin, with a similar number of actual and virtual stations, always benefits from increased virtual stations.
As in Fig. 4, but for Tmax.
Another possible reason is that spatial interpolation of Tmax can achieve higher accuracy than that of Tmin, as found by previous studies (Jarvis and Stuart 2001; Hutchinson et al. 2009). Tmin and Tmax reflect different conditions of the air column. Tmax occurs during the day and represents the temperature of the convective mixed layer mainly related to the net radiation, while Tmin occurs during the night and represents the temperature of the nocturnal boundary layer, which is thicker than the daytime mixed layer (Klotzbach et al. 2009). Local factors such as land-cover types have more impact on Tmin and its variability than on Tmax (Pena-Angulo et al. 2015). The spatial correlation length of Tmin could also be shorter than that of Tmax (Pena-Angulo et al. 2015; Tang et al. 2020). Therefore, Tmax could be less demanding than Tmin in terms of station density for spatial interpolation. The increased density due to virtual stations therefore cannot add much value to interpolated Tmax estimates. Instead, the uncertainty of gap-filled estimates may degrade the quality of interpolated estimates.
b. Temporal performance and uncertainty of various interpolation methods
Experiment 2 compares different interpolation methods (TIER, RF, IDW, bilinear, and nearest neighbor). TIER and RF consider topographic information while the others do not. The interpolation based on actual plus virtual stations always performs better than that based only on actual stations according to all interpolation methods (Fig. 6 and Figs. S6–S8). For example, the median KGE″ values of precipitation based on RF are 0.50 and 0.56 before and after including virtual stations, respectively. The improvement is the most evident for precipitation, followed by Tmin. For Tmax, the improvement, although small in magnitude, still exists for all years. The improvement is larger in the early years than recent years because the number of actual stations increases from 1979 to 2018 (Fig. S2), which makes the benefits of virtual stations gradually diminish. Based on the results, we conclude that virtual stations have a larger contribution to spatial interpolation for simpler methods (e.g., nearest neighbor) than for more sophisticated methods such as RF.
KGE″, CC, and RMSE for interpolated precipitation, Tmin, and Tmax estimates from 1979 to 2018. Median values of all stations are used for every year. The interpolation method is RF.
IDW and bilinear interpolation (Figs. S6 and S7) show that, for the interpolation of Tmax, including virtual stations could worsen RMSE before 1997 but improve RMSE after 1997, because actual stations outnumber virtual stations after 1997. This agrees with the findings in experiment 1.
Interpolation based on TIER is only implemented in 1984 because of the large computational cost. The improvement of including virtual stations is notable for precipitation particularly in the northeastern part of the experimental area where the topography is the most complex (Fig. S9). For Tmin and Tmax, the improvement is less significant. Nevertheless, the reduction of interpolation uncertainty is very substantial for all three variables (Fig. 7). See section 3a for the calculation of TIER uncertainties. About 90% of all stations show reduced uncertainties. The mean ratios of the reduction are −25%, −40%, and −46% for precipitation, Tmin, and Tmax, respectively. The largest reduction for Tmax is caused by its smallest number of actual stations (section 3b).
The reduction ratio of uncertainties after including virtual stations in TIER-based interpolation. UA and UAV are the uncertainties of interpolation based on actual stations and actual plus virtual stations, respectively; the ratio equals (UAV − UA)/UA.
The reduction in uncertainty in TIER is meaningful although it is not as straightforward as the improvement in accuracy metrics. Most interpolation methods produce “deterministic” estimates that, however, are not truly deterministic. The locations of stations, the strategies for filtering neighboring stations, and interpolation methods all affect interpolated estimates. The change of those conditions can result in a change of gridded estimates. The results based on TIER suggest that virtual stations (i.e., gap filling) can effectively improve the density of stations and thus reduce the uncertainties caused by those conditions.
c. Statistical accuracy of SCDNA-based interpolation over North America
The added value of gap filling over North America is explored in experiment 3. We examine interpolation based on SCDNA data (observations and filled data), referred to as INT-SCDNA, and interpolation based only on station observations, referred to as INT-OBS. The spatial comparison shows that INT-SCDNA achieves better accuracy metrics for most stations than INT-OBS (Fig. 8). For example, INT-SCDNA shows higher KGE″ than INT-OBS for 59%, 93%, and 81% of all precipitation, Tmin, and Tmax stations, respectively. For precipitation, the improvement of KGE″, CC, and RMSE is larger in high latitudes and Mexico, where the density of stations is lower than in the United States. For Tmin and Tmax, the improvement is more evident in Mexico, where the variability of temperature is stronger than at higher latitudes.
The spatial distributions and histograms of the differences of KGE″, CC, and RMSE between interpolation based on SCDNA (observations and filled data) and interpolation based only on observations. The interpolation method is IDW. The period is from 1979 to 2018.
The temporal comparison shows that INT-SCDNA is notably better than INT-OBS for all variables during the study period (Fig. 9). Note that the metrics of INT-SCDNA are always better than those of INT-OBS for all days in Fig. 9 due to the effect of moving averaging. Even so, the ratios of days that INT-SCDNA is better than INT-OBS before moving average are very large for most variables and metrics. For Tmin, the ratio is up to 99.8% and 99.7% for CC and RMSE, respectively. For all the three variables, KGE″ shows smaller ratios (76%–86%) than CC and RMSE. This is because filled data in SCDNA could show weaker variability (i.e., standard deviation) than raw observations in some cases, which results in weaker variability of interpolated estimates and thus a larger variability bias term in the formula of KGE″ [Eq. (6)]. This is an inevitable limitation of existing gap-filling methods that rely on raw station observations within the study area. However, the negative effect just slightly weakens the performance of SCDNA as a whole. Seasonal analysis (Fig. S10) shows that the improvement of INT-SCDNA against INT-OBS is larger in the warm season when the spatial variability is larger (i.e., shorter spatial correlation length) and interpolated estimates show degraded quality.
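The interplay between the smoothed curves and the day-by-day comparison can be shown with a short sketch. The trailing (rather than centered) window and the function names are assumptions; the text only states that a 365-day moving average is applied and that the ratios are computed before smoothing.

```python
def moving_average(values, window=365):
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    out = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]  # slide the window by one day
        out.append(total / window)
    return out


def fraction_better(metric_a, metric_b, higher_is_better=True):
    """Fraction of days on which series a beats series b (no smoothing applied)."""
    if len(metric_a) != len(metric_b) or not metric_a:
        raise ValueError("series must be non-empty and of equal length")
    wins = sum(
        (a > b) if higher_is_better else (a < b)
        for a, b in zip(metric_a, metric_b)
    )
    return wins / len(metric_a)
```

A series that wins on only part of the days can still lie above the other everywhere after smoothing, which is why the unsmoothed ratio is the more informative number. For RMSE, pass `higher_is_better=False`.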
The temporal variations of metrics from interpolation based on SCDNA (observations and filled data) and based only on observations. The interpolation method is IDW. The period is from 1979 to 2018. A moving average with a window size of 365 is applied to smooth the highly variable daily curves. The ratios of days that SCDNA shows better metrics (higher CC and KGE″ and lower RMSE before moving average) than observations are shown in the bottom-left corner.
The mean error (i.e., the average difference between estimates and observations) is also investigated. For precipitation, Tmin, and Tmax, the mean errors over North America are −0.01 mm day−1, 0.05°C, and 0.03°C for INT-OBS, respectively, and −0.07 mm day−1, 0.09°C, and 0.01°C for INT-SCDNA, respectively. The mean errors are generally small due to the offset of underestimated and overestimated values. Gap filling prior to spatial interpolation results in slightly larger mean errors for precipitation and Tmin, which, however, is not significant. For example, for precipitation, gap filling only results in greater mean errors for 52.1% of all stations. Meanwhile, the mean error is small compared to precipitation intensity (~2.4 mm day−1 over North America), and results have shown that gap filling improves the mean absolute error and RMSE, which do not suffer from the offset of negative and positive values.
Overall, the temporal comparison (Fig. 9) shows that gap filling prior to spatial interpolation of precipitation over North America results in 2.8%, 2.3%, and 2.5% improvement of CC, RMSE, and KGE″, respectively. For high-latitude regions (>65°N), the improvement of CC and KGE″ reaches 17.78% and 4.91%, while the improvement of RMSE is smaller than 1% partly due to the smaller precipitation intensity compared to mid- and low-latitude regions. For Tmin and Tmax, the improvements over high-latitude regions are ~5.1%, ~7.8%, and ~5.4% for CC, RMSE, and KGE″, respectively; over mid- and low-latitude regions, the improvement of RMSE is ~2.5%, while the improvement of CC and KGE″ is less than 1% because interpolated temperature estimates already show good performance. The spatial comparison also shows larger improvement in complex terrain and high-latitude regions (Fig. 8). For example, for precipitation, the improvement of CC due to gap filling prior to interpolation is 1.3%, 2.5%, and 11.78% for all of North America, complex topography with an elevation higher than 2000 m, and high-latitude regions north of 65°N, respectively.
d. Trend of SCDNA-based interpolation over North America
SCDs often have a large portion of filled or reconstructed data. For example, the ratios of filled data in SCDNA are about 43% and 40% for precipitation and temperature (Tmin and Tmax), respectively. It is important to know whether gridded estimates based on SCDNA with both filled data and observations can correctly capture long-term trends. In this section, only stations with at least 35-yr observations are involved in the validation described in experiment 4. Linear trends are estimated based on annual precipitation and annual average temperature. The numbers of qualified stations are 5068, 3675, and 3724 for precipitation, Tmin, and Tmax, respectively.
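A linear trend of annual values can be estimated as in the sketch below. Ordinary least squares is an assumption here, since the text says only "linear trends" and does not name the estimator; the function names are hypothetical.

```python
def annual_means(daily_by_year):
    """Map {year: [daily values]} to {year: annual mean}, e.g., for temperature."""
    return {year: sum(vals) / len(vals) for year, vals in daily_by_year.items()}


def linear_trend(years, values):
    """Ordinary-least-squares slope of `values` against `years` (units per year)."""
    if len(years) != len(values) or len(years) < 2:
        raise ValueError("need at least two (year, value) pairs of equal length")
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx
```

Applying `linear_trend` to the station observations and to each of the two interpolated series at the same validation station gives the three trends that are compared in this section. For annual precipitation totals, the annual sum would replace the annual mean.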
According to observations, precipitation shows a decreasing trend in the western United States and part of the southern United States and an increasing trend in the eastern and northern United States (Fig. 10). Tmin and Tmax generally show an increasing trend in North America. INT-SCDNA and INT-OBS can both reproduce the spatial distributions of precipitation and temperature trends in North America, and INT-SCDNA shows trends closer to observations than INT-OBS does. It is noted that INT-OBS has notable biases of Tmin and Tmax trends in the western United States. The trend based on INT-OBS is even opposite in sign to the trend based on observations at some stations. We also investigated stations that pass the significance test (Fig. S11), which are fewer in number compared to Fig. 10 but show the same problem.
The distributions and histograms of linear trends based on (left) station observations (at least 35 years), (center) interpolation based on observations, and (right) interpolation based on SCDNA. The interpolation method is IDW.
A typical station is analyzed to demonstrate why INT-OBS shows biased trends for Tmax in the western United States (Fig. 11). A certain number of neighboring stations are needed to obtain interpolated estimates at a point or grid. For INT-OBS, neighboring stations cannot remain the same during the whole period, unless the interpolation uses only stations without any gap or missing values during the 40-yr period, which would result in a very small number of qualified stations. For this typical station, the locations of neighboring stations are quite variable before 1990 and experience a changepoint at the end of 1989 due to the installation of several new stations (Fig. 11a). The locations of neighboring stations also change at the beginning of 2010, although this change is not drastic. The two changepoints result in two notable changepoints of Tmax estimates from IDW-based INT-OBS in 1990 and 2010 (Fig. 11b), which affect the overall trend of INT-OBS from 1979 to 2018. To overcome the absence of topography information in IDW, we also utilize locally weighted linear regression, which uses latitude, longitude, and elevation as the predictors and Tmax as the predictand. The linear regression does improve the accuracy of interpolation and trend estimation but cannot remove the effect of the two changepoints (Fig. 11c). In contrast, INT-SCDNA shows a more consistent trend because SCDNA has no gap and thus the locations of neighboring stations never change during the study period.
Fig. 11. For a target station (USC00264935: 42.00°N, 117.72°W, 1350 m), (a) the distance of the 10 nearest neighboring stations to the target station for every day from 1979 to 2018, (b) annual mean of daily maximum temperature based on observations and IDW-based interpolation, and (c) annual mean of daily maximum temperature based on observations and locally weighted linear regression-based interpolation.
Raising the minimum record length required of stations used in interpolation (20 years in experiment 4) can reduce but cannot eliminate the effect of changing neighboring stations, and it may come at the cost of lower interpolation accuracy because fewer stations are available. Using more advanced interpolation methods (e.g., locally weighted linear regression) cannot completely solve this problem either, because of errors in the spatial interpolation. The effect of changing neighboring stations could be substantial in regions where variables show large spatial variation caused by factors such as topography, land cover, and climate. Temperature is strongly affected by elevation, and thus for stations located in mountainous regions, even a small change in neighbor locations (e.g., 2010 in Fig. 11a) could cause notable changepoints in interpolated estimates.
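The changepoint mechanism described above can be illustrated with a minimal synthetic sketch. It is not the configuration used in this study: the three stations, their coordinates and temperatures, the distance exponent of 2, and the function names `idw` and `ols_slope` are all hypothetical. Every station has a flat record, so the true trend at the target is zero, and the "filled" network is assumed to be gap-filled perfectly, which is optimistic.

```python
import math

def idw(target_xy, neighbors, power=2.0):
    """Inverse distance weighted estimate from (x, y, value) neighbors."""
    num = den = 0.0
    for x, y, value in neighbors:
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            return value  # target coincides with a station
        weight = dist ** -power
        num += weight * value
        den += weight
    return num / den

def ols_slope(years, values):
    """Least squares slope of values against years."""
    n = len(years)
    t_mean = sum(years) / n
    v_mean = sum(values) / n
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(years, values))
    sxx = sum((t - t_mean) ** 2 for t in years)
    return sxy / sxx

# Hypothetical network: (x km, y km, constant Tmax in deg C, first year with data).
STATIONS = [
    (10.0, 0.0, 20.0, 1979),  # valley station, full record
    (0.0, 10.0, 20.0, 1979),  # valley station, full record
    (5.0, 0.0, 12.0, 1990),   # ridge station installed later
]
TARGET = (0.0, 0.0)
YEARS = list(range(1979, 2019))

# INT-OBS analogue: only stations that report in a given year are used.
int_obs = [
    idw(TARGET, [(x, y, v) for x, y, v, first in STATIONS if first <= yr])
    for yr in YEARS
]

# INT-SCDNA analogue: the ridge station is assumed to be gap-filled back to
# 1979, so the neighbor set is identical in every year.
int_filled = [idw(TARGET, [(x, y, v) for x, y, v, _ in STATIONS]) for _ in YEARS]

step = int_obs[YEARS.index(1990)] - int_obs[YEARS.index(1989)]
print(f"step in INT-OBS analogue at 1990: {step:.2f} deg C")
print(f"trend, INT-OBS analogue: {ols_slope(YEARS, int_obs):.4f} deg C/yr")
print(f"trend, filled analogue:  {ols_slope(YEARS, int_filled):.4f} deg C/yr")
```

In this toy setup the interpolated series built from the changing neighbor set acquires a step when the ridge station appears, and therefore a spurious negative trend, while the series built from the fixed neighbor set stays flat.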
Overall, INT-SCDNA achieves better trend estimation than INT-OBS (Fig. 12). For precipitation, INT-SCDNA and INT-OBS show similar trends, particularly for stations that pass the significance test. For Tmin and Tmax, the agreement between INT-SCDNA and OBS is higher than that between INT-OBS and OBS. Compared to INT-OBS, INT-SCDNA has far fewer points with a large discrepancy from OBS in trend estimation (Fig. 12). In particular, for Tmin and Tmax, INT-OBS shows trends between −0.1 and −0.05°C yr⁻¹ for some stations where OBS shows trends between 0° and 0.05°C yr⁻¹; these stations correspond to the western CONUS in Fig. 10. INT-SCDNA generates reasonable trends for those stations. According to the statistics of mean trend values and CC between OBS and INT (Table 1), INT-SCDNA outperforms INT-OBS in most cases. For example, for Tmax, the mean trends of stations that pass the significance test are 0.032°, 0.020°, and 0.028°C yr⁻¹ for OBS, INT-OBS, and INT-SCDNA, respectively. The CC between OBS and INT-OBS trends is 0.181, whereas the CC between OBS and INT-SCDNA trends is 0.431.
Fig. 12. Scatterplots between the trends estimated from OBS, INT-OBS, and INT-SCDNA. Blue points represent all stations. Red points represent stations with a significant trend (p < 0.05).
Table 1. The mean value of trends estimated from observations and interpolated data, and the correlation between observed trend and interpolated trend. The * represents stations with significant trends (p < 0.05).
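The two kinds of statistic reported for the trend comparison, a mean trend per dataset and the CC between observed and interpolated station trends, can be sketched as follows. This is an illustrative outline only: the three stations and their slopes are synthetic, the function names are hypothetical, ordinary least squares is assumed for the linear trend, and the significance test is omitted because its form is not restated in this section.

```python
import math

def linear_trend(years, values):
    """Ordinary least squares slope (units of values per year)."""
    n = len(years)
    t_mean = sum(years) / n
    v_mean = sum(values) / n
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(years, values))
    sxx = sum((t - t_mean) ** 2 for t in years)
    return sxy / sxx

def pearson_cc(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    a_mean = sum(a) / n
    b_mean = sum(b) / n
    sab = sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b))
    saa = sum((x - a_mean) ** 2 for x in a)
    sbb = sum((y - b_mean) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

YEARS = list(range(1979, 2019))

def synthetic_series(slope, base=15.0):
    """Synthetic annual mean series with a prescribed linear slope."""
    return [base + slope * (yr - YEARS[0]) for yr in YEARS]

# Synthetic slopes (deg C per year) for three hypothetical stations.
obs_series = [synthetic_series(s) for s in (0.01, 0.03, 0.05)]
int_series = [synthetic_series(s) for s in (0.00, 0.04, 0.03)]

# One trend per station for each dataset.
obs_trends = [linear_trend(YEARS, v) for v in obs_series]
int_trends = [linear_trend(YEARS, v) for v in int_series]

# Summary statistics analogous to the two kinds of entry in the table.
mean_obs_trend = sum(obs_trends) / len(obs_trends)
mean_int_trend = sum(int_trends) / len(int_trends)
trend_cc = pearson_cc(obs_trends, int_trends)

print(f"mean trend: OBS {mean_obs_trend:.3f}, INT {mean_int_trend:.3f} deg C/yr")
print(f"CC between OBS and INT station trends: {trend_cc:.3f}")
```

Note that the CC here is computed across stations on the trend values themselves, not on the daily series, so it measures how well the interpolation reproduces the spatial pattern of trends.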
5. Summary and conclusions
This study investigated the value of gap filling prior to spatial interpolation using data from a serially complete dataset over North America, that is, the SCDNA.
Gap filling improves the accuracy of spatial interpolation according to the three metrics used in this study (KGE″, CC, and RMSE). The improvement due to gap filling grows as the number of virtual stations (i.e., filled data) increases and the number of actual stations (i.e., raw observations) decreases, according to an experiment based on many network combinations in a 10° × 10° area in the Rocky Mountains in the western United States (Fig. 2). The added value of gap filling is largest for precipitation, followed by Tmin, and smallest for Tmax, which is partly determined by the difficulty of obtaining accurate gridded estimates for the three variables.
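For readers who want to reproduce this kind of comparison, the three metrics can be sketched as below. The CC and RMSE are standard. The KGE″ form is an assumption made for this sketch: it takes the variant in which the bias term is scaled by the standard deviation of the reference series instead of its mean, so that the score remains stable when the mean is close to zero. The function names and the toy series are hypothetical.

```python
import math

def _mean(x):
    return sum(x) / len(x)

def _std(x):
    """Population standard deviation."""
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

def cc(est, ref):
    """Pearson correlation coefficient."""
    e_mean, r_mean = _mean(est), _mean(ref)
    cov = sum((e - e_mean) * (r - r_mean) for e, r in zip(est, ref)) / len(ref)
    return cov / (_std(est) * _std(ref))

def rmse(est, ref):
    """Root-mean-square error."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))

def kge_double_prime(est, ref):
    """Assumed KGE'' form: bias normalized by the reference standard deviation."""
    r = cc(est, ref)
    alpha = _std(est) / _std(ref)                # variability ratio
    beta = (_mean(est) - _mean(ref)) / _std(ref)  # scaled bias (assumption)
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + beta ** 2)

# Toy series: a reference and an estimate with a constant bias of +1.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
estimate = [2.0, 3.0, 4.0, 5.0, 6.0]

print(f"CC    = {cc(estimate, reference):.3f}")
print(f"RMSE  = {rmse(estimate, reference):.3f}")
print(f"KGE'' = {kge_double_prime(estimate, reference):.3f}")
```

In the toy case the constant bias leaves the correlation and the variability ratio untouched, so the whole KGE″ penalty comes from the scaled bias term.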
Three types of interpolation methods are studied in the experimental area: three statistical methods (IDW, bilinear, and nearest neighbor), one machine learning method (RF), and one knowledge-based method (TIER). Results show that gap filling improves the accuracy of interpolation for all methods. The improvement is larger for simpler methods such as nearest neighbor and smaller for more sophisticated methods that can consider physical information (e.g., topography). In addition, gap filling greatly reduces the uncertainties in spatial interpolation, an effect that is more evident when the number of actual stations is smaller.
SCDNA-based interpolation (INT-SCDNA) shows higher accuracy than observation-based interpolation (INT-OBS) over North America from 1979 to 2018. The spatial comparison shows that the improvement is larger in regions with few stations (e.g., high latitudes) and smaller in regions with dense station networks (e.g., the contiguous United States). For example, the improvement of CC for precipitation estimates is 2.8% over North America and 17.78% for high-latitude regions (>65°N). The temporal comparison shows that SCDNA improves the accuracy metrics for most days in the study period. For CC and RMSE, the fraction of days on which INT-SCDNA outperforms INT-OBS is close to or larger than 90%. For KGE″, the fraction is lower because filled data in SCDNA could result in smoothed variability in some cases. In addition, the improvement is more pronounced in the early years than in recent years because more stations are available in recent years.
INT-SCDNA using IDW achieves better trend estimation for precipitation and temperature than INT-OBS in terms of both spatial distributions and statistical metrics. In particular, INT-OBS shows a trend opposite to the observation-based trend for some stations in the western United States because the locations of stations used in interpolation change during the study period, resulting in artificial changepoints in interpolated estimates. Adopting more advanced interpolation methods can alleviate but cannot solve this problem because current methods cannot perfectly account for the spatiotemporal variability of meteorological variables. In contrast, INT-SCDNA avoids this problem because SCDNA is serially complete. In addition, statistics show that trends based on INT-SCDNA generally agree better with the observed trends in terms of mean value and correlation coefficient. Gridded meteorological datasets could show quite different climate trends, particularly in complex terrain (e.g., Henn et al. 2018), and our results show that gap filling could be a promising choice for obtaining reasonable gridded trend estimates.
Overall, this study demonstrates that gap filling can contribute to the improvement of statistical accuracy and long-term trend estimation in spatial interpolation in North America. An important reason is the high quality of SCDNA due to the high station densities in North America and the well-designed gap-filling strategies (Tang et al. 2020). However, gap filling is not always effective. In regions where accurate spatial interpolation is challenging due to low station densities (e.g., Africa) and strong variability of topography and climate (e.g., Tibetan Plateau), it is expected that gap filling could have relatively low accuracy and thus may not adequately improve the accuracy of spatial interpolation (Eischeid et al. 2000; Tang et al. 2020). Therefore, although we recommend gap filling as an effective strategy based on SCDNA in North America, researchers should always check the quality of their gap-filling methods or serially complete datasets before carrying out spatial interpolation.
Acknowledgments
The study is funded by the Global Water Futures (GWF) program in Canada. SMP acknowledges the support of the Natural Sciences and Engineering Research Council of Canada (NSERC Discovery Grant RGPIN-2019-06894).
REFERENCES
Allen, R. J., and A. T. DeGaetano, 2001: Estimating missing daily temperature extremes using an optimized regression approach. Int. J. Climatol., 21, 1305–1319, https://doi.org/10.1002/joc.679.
Appelhans, T., E. Mwangomo, D. R. Hardy, A. Hemp, and T. Nauss, 2015: Evaluating machine learning approaches for the interpolation of monthly air temperature at Mt. Kilimanjaro, Tanzania. Spat. Stat., 14, 91–113, https://doi.org/10.1016/j.spasta.2015.05.008.
Baez-Villanueva, O. M., and Coauthors, 2020: RF-MEP: A novel random forest method for merging gridded precipitation products and ground-based measurements. Remote Sens. Environ., 239, 111606, https://doi.org/10.1016/j.rse.2019.111606.
Beck, H. E., E. F. Wood, M. Pan, C. K. Fisher, D. G. Miralles, A. I. J. M. van Dijk, T. R. McVicar, and R. F. Adler, 2019: MSWEP V2 global 3-hourly 0.1° precipitation: Methodology and quantitative assessment. Bull. Amer. Meteor. Soc., 100, 473–500, https://doi.org/10.1175/BAMS-D-17-0138.1.
Beguería, S., M. Tomas-Burguera, R. Serrano-Notivoli, D. Peña-Angulo, S. M. Vicente-Serrano, and J.-C. González-Hidalgo, 2019: Gap filling of monthly temperature data and its effect on climatic variability and trends. J. Climate, 32, 7797–7821, https://doi.org/10.1175/JCLI-D-19-0244.1.
Breiman, L., 2001: Random forests. Mach. Learn., 45, 5–32, https://doi.org/10.1023/A:1010933404324.
Daly, C., G. H. Taylor, W. P. Gibson, T. W. Parzybok, G. L. Johnson, and P. A. Pasteris, 2000: High-quality spatial climate data sets for the United States and beyond. Trans. ASAE, 43, 1957, https://doi.org/10.13031/2013.3101.
Daly, C., W. P. Gibson, G. H. Taylor, G. L. Johnson, and P. Pasteris, 2002: A knowledge-based approach to the statistical mapping of climate. Climate Res., 22, 99–113, https://doi.org/10.3354/cr022099.
Daly, C., J. W. Smith, J. I. Smith, and R. B. McKane, 2007: High-resolution spatial modeling of daily weather elements for a catchment in the Oregon Cascade Mountains, United States. J. Appl. Meteor. Climatol., 46, 1565–1586, https://doi.org/10.1175/JAM2548.1.
Daly, C., M. Halbleib, J. I. Smith, W. P. Gibson, M. K. Doggett, G. H. Taylor, J. Curtis, and P. P. Pasteris, 2008: Physiographically sensitive mapping of climatological temperature and precipitation across the conterminous United States. Int. J. Climatol., 28, 2031–2064, https://doi.org/10.1002/joc.1688.
Di Luzio, M., G. L. Johnson, C. Daly, J. K. Eischeid, and J. G. Arnold, 2008: Constructing retrospective gridded daily precipitation and temperature datasets for the conterminous United States. J. Appl. Meteor. Climatol., 47, 475–497, https://doi.org/10.1175/2007JAMC1356.1.
Di Piazza, A., F. Lo Conti, L. V. Noto, F. Viola, and G. La Loggia, 2011: Comparative analysis of different techniques for spatial interpolation of rainfall data to create a serially complete monthly time series of precipitation for Sicily, Italy. Int. J. Appl. Earth Obs. Geoinf., 13, 396–408, https://doi.org/10.1016/j.jag.2011.01.005.
Durre, I., M. J. Menne, B. E. Gleason, T. G. Houston, and R. S. Vose, 2010: Comprehensive automated quality assurance of daily surface observations. J. Appl. Meteor. Climatol., 49, 1615–1633, https://doi.org/10.1175/2010JAMC2375.1.
Eischeid, J. K., P. A. Pasteris, H. F. Diaz, M. S. Plantico, and N. J. Lott, 2000: Creating a serially complete, national daily time series of temperature and precipitation for the western United States. J. Appl. Meteor., 39, 1580–1591, https://doi.org/10.1175/1520-0450(2000)039<1580:CASCND>2.0.CO;2.
Gupta, H. V., H. Kling, K. K. Yilmaz, and G. F. Martinez, 2009: Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. J. Hydrol., 377, 80–91, https://doi.org/10.1016/j.jhydrol.2009.08.003.
Hamada, A., O. Arakawa, and A. Yatagai, 2011: An automated quality control method for daily rain-gauge data. Global Environ. Res., 15, 183–192.
Haylock, M. R., N. Hofstra, A. M. G. Klein Tank, E. J. Klok, P. D. Jones, and M. New, 2008: A European daily high-resolution gridded data set of surface temperature and precipitation for 1950–2006. J. Geophys. Res., 113, D20119, https://doi.org/10.1029/2008JD010201.
Hengl, T., M. Nussbaum, M. N. Wright, G. B. M. Heuvelink, and B. Gräler, 2018: Random forest as a generic framework for predictive modeling of spatial and spatio-temporal variables. PeerJ, 6, e5518, https://doi.org/10.7717/peerj.5518.
Henn, B., A. J. Newman, B. Livneh, C. Daly, and J. D. Lundquist, 2018: An assessment of differences in gridded precipitation datasets in complex terrain. J. Hydrol., 556, 1205–1219, https://doi.org/10.1016/j.jhydrol.2017.03.008.
Hofstra, N., M. New, and C. McSweeney, 2010: The influence of interpolation and station network density on the distributions and trends of climate variables in gridded daily data. Climate Dyn., 35, 841–858, https://doi.org/10.1007/s00382-009-0698-1.
Hutchinson, M. F., D. W. McKenney, K. Lawrence, J. H. Pedlar, R. F. Hopkinson, E. Milewska, and P. Papadopol, 2009: Development and testing of Canada-wide interpolated spatial models of daily minimum–maximum temperature and precipitation for 1961–2003. J. Appl. Meteor. Climatol., 48, 725–741, https://doi.org/10.1175/2008JAMC1979.1.
Jarvis, C. H., and N. Stuart, 2001: A comparison among strategies for interpolating maximum and minimum daily air temperatures. Part II: The interaction between number of guiding variables and the type of interpolation method. J. Appl. Meteor., 40, 1075–1084, https://doi.org/10.1175/1520-0450(2001)040<1075:ACASFI>2.0.CO;2.
Kling, H., M. Fuchs, and M. Paulin, 2012: Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios. J. Hydrol., 424–425, 264–277, https://doi.org/10.1016/j.jhydrol.2012.01.011.
Klotzbach, P. J., R. A. Pielke Sr., R. A. Pielke Jr., J. R. Christy, and R. T. McNider, 2009: An alternative explanation for differential temperature trends at the surface and in the lower troposphere. J. Geophys. Res., 114, D21102, https://doi.org/10.1029/2009JD011841.
Li, J., A. D. Heap, A. Potter, and J. J. Daniell, 2011: Application of machine learning methods to spatial interpolation of environmental variables. Environ. Modell. Software, 26, 1647–1659, https://doi.org/10.1016/j.envsoft.2011.07.004.
Livneh, B., T. J. Bohn, D. W. Pierce, F. Munoz-Arriola, B. Nijssen, R. Vose, D. R. Cayan, and L. Brekke, 2015: A spatially comprehensive, hydrometeorological data set for Mexico, the U.S., and southern Canada 1950–2013. Sci. Data, 2, 150042, https://doi.org/10.1038/sdata.2015.42.
Longman, R. J., and Coauthors, 2018: Compilation of climate data from heterogeneous networks across the Hawaiian Islands. Sci. Data, 5, 180012, https://doi.org/10.1038/sdata.2018.12.
Longman, R. J., and Coauthors, 2019: High-resolution gridded daily rainfall and temperature for the Hawaiian Islands (1990–2014). J. Hydrometeor., 20, 489–508, https://doi.org/10.1175/JHM-D-18-0112.1.
Longman, R. J., A. J. Newman, T. W. Giambelluca, and M. Lucas, 2020: Characterizing the uncertainty and assessing the value of gap-filled daily rainfall data in Hawaii. J. Appl. Meteor. Climatol., 59, 1261–1276, https://doi.org/10.1175/JAMC-D-20-0007.1.
Menne, M. J., I. Durre, R. S. Vose, B. E. Gleason, and T. G. Houston, 2012: An overview of the Global Historical Climatology Network-Daily database. J. Atmos. Oceanic Technol., 29, 897–910, https://doi.org/10.1175/JTECH-D-11-00103.1.
Newman, A. J., and M. P. Clark, 2020: TIER version 1.0: An open-source Topographically InformEd Regression (TIER) model to estimate spatial meteorological fields. Geosci. Model Dev., 13, 1827–1843, https://doi.org/10.5194/gmd-13-1827-2020.
Newman, A. J., and Coauthors, 2015: Gridded ensemble precipitation and temperature estimates for the contiguous United States. J. Hydrometeor., 16, 2481–2500, https://doi.org/10.1175/JHM-D-15-0026.1.
Newman, A. J., M. P. Clark, R. J. Longman, E. Gilleland, T. W. Giambelluca, and J. R. Arnold, 2019: Use of daily station observations to produce high-resolution gridded probabilistic precipitation and temperature time series for the Hawaiian Islands. J. Hydrometeor., 20, 509–529, https://doi.org/10.1175/JHM-D-18-0113.1.
Newman, A. J., M. P. Clark, A. W. Wood, and J. R. Arnold, 2020: Probabilistic spatial meteorological estimates for Alaska and the Yukon. J. Geophys. Res. Atmos., 125, e2020JD032696, https://doi.org/10.1029/2020JD032696.
Pappas, C., S. M. Papalexiou, and D. Koutsoyiannis, 2014: A quick gap filling of missing hydrometeorological data. J. Geophys. Res. Atmos., 119, 9290–9300, https://doi.org/10.1002/2014JD021633.
Peña-Angulo, D., N. Cortesi, M. Brunetti, and J. C. González-Hidalgo, 2015: Spatial variability of maximum and minimum monthly temperature in Spain during 1981–2010 evaluated by correlation decay distance (CDD). Theor. Appl. Climatol., 122, 35–45, https://doi.org/10.1007/s00704-014-1277-x.
Ramos-Calzado, P., J. Gómez-Camacho, F. Pérez-Bernal, and M. F. Pita-López, 2008: A novel approach to precipitation series completion in climatological datasets: Application to Andalusia. Int. J. Climatol., 28, 1525–1534, https://doi.org/10.1002/joc.1657.
Santos, L., G. Thirel, and C. Perrin, 2018: Technical note: Pitfalls in using log-transformed flows within the KGE criterion. Hydrol. Earth Syst. Sci., 22, 4583–4591, https://doi.org/10.5194/hess-22-4583-2018.
Serrano-Notivoli, R., S. Beguería, and M. de Luis, 2019: STEAD: A high-resolution daily gridded temperature dataset for Spain. Earth Syst. Sci. Data, 11, 1171–1188, https://doi.org/10.5194/essd-11-1171-2019.
Simolo, C., M. Brunetti, M. Maugeri, and T. Nanni, 2010: Improving estimation of missing values in daily precipitation series by a probability density function-preserving approach. Int. J. Climatol., 30, 1564–1576, https://doi.org/10.1002/joc.1992.
Tang, G., M. P. Clark, A. J. Newman, A. W. Wood, S. M. Papalexiou, V. Vionnet, and P. H. Whitfield, 2020: SCDNA: A serially complete precipitation and temperature dataset for North America from 1979 to 2018. Earth Syst. Sci. Data, 12, 2381–2409, https://doi.org/10.5194/essd-12-2381-2020.
Vicente-Serrano, S. M., M. A. Saz-Sanchez, and J. M. Cuadrat, 2003: Comparative analysis of interpolation methods in the middle Ebro Valley (Spain): Application to annual precipitation and temperature. Climate Res., 24, 161–180, https://doi.org/10.3354/cr024161.
Webb, M. A., A. Hall, D. Kidd, and B. Minasny, 2016: Local-scale spatial modelling for interpolating climatic temperature variables to predict agricultural plant suitability. Theor. Appl. Climatol., 124, 1145–1165, https://doi.org/10.1007/s00704-015-1461-7.
Woldesenbet, T. A., N. A. Elagib, L. Ribbe, and J. Heinrich, 2017: Gap filling and homogenization of climatological datasets in the headwater region of the Upper Blue Nile Basin, Ethiopia. Int. J. Climatol., 37, 2122–2140, https://doi.org/10.1002/joc.4839.