Search Results
You are looking at 1 - 10 of 41 items for Author or Editor: Martyn P. Clark
Abstract
A snow data assimilation study was undertaken in which real data were used to update a conceptual model, SNOW-17. The aim of this study is to improve the model’s estimate of snow water equivalent (SWE) by merging the uncertainties associated with meteorological forcing data and SWE observations within the model. This is done with a view to aiding the estimation of snowpack initial conditions for the ultimate objective of streamflow forecasting via a distributed hydrologic model. To provide a test of this methodology, the authors performed experiments at 53 stations in Colorado. In each case the situation of an unobserved location is mimicked, using the data at any given station only for validation; essentially, these are withholding experiments. Both ensembles of model forcing data and assimilated data were derived via interpolation and stochastic modeling of data from surrounding sources. Through a process of cross validation the error for the ensemble of model forcing data and assimilated observations is explicitly estimated. An ensemble square root Kalman filter is applied to perform assimilation on a 5-day cycle. Improvements in the resulting SWE are most evident during the early accumulation season and late melt period. However, the large temporal correlation inherent in a snowpack results in a less than optimal assimilation and the increased skill is marginal. Once this temporal persistence is removed from both model and assimilated observations during the update cycle, a result is produced that is, within the limits of available information, consistently superior to either the model or interpolated observations.
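The update step of an ensemble square root Kalman filter can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single scalar state (SWE at one location) that is observed directly, and the function name `ensrf_update`, the ensemble size, and the numerical values are hypothetical. The ensemble mean is updated with the usual Kalman gain, while the deviations from the mean are updated with a reduced gain so that no perturbed observations are needed.

```python
import numpy as np

def ensrf_update(ensemble, obs, obs_var):
    """Square root filter update for one directly observed scalar state."""
    ensemble = np.asarray(ensemble, dtype=float)
    mean = ensemble.mean()
    perts = ensemble - mean                      # deviations from the ensemble mean
    prior_var = perts.var(ddof=1)                # sample variance of the prior ensemble
    gain = prior_var / (prior_var + obs_var)     # Kalman gain for the mean
    # Reduced-gain factor applied to the deviations (square root form)
    alpha = 1.0 / (1.0 + np.sqrt(obs_var / (prior_var + obs_var)))
    new_mean = mean + gain * (obs - mean)
    new_perts = perts - alpha * gain * perts
    return new_mean + new_perts

# Hypothetical example: 50-member SWE ensemble (mm) and one interpolated observation
rng = np.random.default_rng(0)
prior = rng.normal(200.0, 30.0, size=50)
posterior = ensrf_update(prior, obs=230.0, obs_var=20.0 ** 2)
```

With this choice of reduced gain, the posterior sample variance equals (1 - gain) times the prior sample variance, which is the variance the Kalman filter prescribes.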
Abstract
This paper describes a flexible method to generate ensemble gridded fields of precipitation in complex terrain. The method is based on locally weighted regression, in which spatial attributes from station locations are used as explanatory variables to predict spatial variability in precipitation. For each time step, regression models are used to estimate the conditional cumulative distribution function (cdf) of precipitation at each grid cell (conditional on daily precipitation totals from a sparse station network), and ensembles are generated by using realizations from correlated random fields to extract values from the gridded precipitation cdfs. Daily high-resolution precipitation ensembles are generated for a 300 km × 300 km section of western Colorado (dx = 2 km) for the period 1980–2003. The ensemble precipitation grids reproduce the climatological precipitation gradients and observed spatial correlation structure. Probabilistic verification shows that the precipitation estimates are reliable, in the sense that there is close agreement between the frequency of occurrence of specific precipitation events in different probability categories and the probability that is estimated from the ensemble. The probabilistic estimates have good discrimination in the sense that the estimated probabilities differ significantly between cases when specific precipitation events occur and when they do not. The method may be improved by merging the gauge-based precipitation ensembles with remotely sensed precipitation estimates from ground-based radar and satellites, or with precipitation and wind fields from numerical weather prediction models. The stochastic modeling framework developed in this study is flexible and can easily accommodate additional modifications and improvements.
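The idea of drawing ensemble members from gridded precipitation cdfs with correlated random fields can be sketched as follows. This is an illustration under strong simplifying assumptions, not the method of the paper: the grid is a 1-D transect, the spatial correlation is exponential, and the cdf at each cell is a hypothetical mixture of a dry probability and an exponential wet-day amount. The names `correlated_normal_field`, `sample_precip`, `pop`, and `wet_mean` are invented for this example.

```python
import math
import numpy as np

def correlated_normal_field(x_km, corr_length_km, rng):
    """Standard normal deviates with exponential spatial correlation."""
    dist = np.abs(x_km[:, None] - x_km[None, :])
    cov = np.exp(-dist / corr_length_km)
    # Small jitter keeps the Cholesky factorization numerically stable
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(x_km)))
    return chol @ rng.standard_normal(len(x_km))

def sample_precip(z, pop, wet_mean):
    """Invert an assumed per-cell cdf at the probabilities implied by z."""
    # Convert normal deviates to cumulative probabilities in (0, 1)
    u = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    dry = u <= 1.0 - pop                          # below the dry-day probability mass
    # Rescale the wet part of the cdf to (0, 1); assumes pop > 0 everywhere
    u_wet = np.clip((u - (1.0 - pop)) / pop, 0.0, 1.0 - 1e-12)
    amount = -wet_mean * np.log1p(-u_wet)         # inverse exponential cdf
    return np.where(dry, 0.0, amount)

rng = np.random.default_rng(1)
x = np.arange(0.0, 100.0, 2.0)                    # 50 cells at 2-km spacing
pop = np.linspace(0.2, 0.8, x.size)               # hypothetical probability of precipitation
wet_mean = np.full(x.size, 5.0)                   # hypothetical mean wet-day amount (mm)
members = np.array([
    sample_precip(correlated_normal_field(x, 30.0, rng), pop, wet_mean)
    for _ in range(20)
])
```

Because neighboring cells share similar random deviates, each member is spatially coherent, while the spread across members reflects the width of the cdf at each cell.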
Abstract
The timing of snowmelt runoff (SMR) for 84 rivers in the western United States is examined to understand the character of SMR variability and the climate processes that may be driving changes in SMR timing. Results indicate that the timing of SMR for many rivers in the western United States has shifted to earlier in the snowmelt season. This shift occurred as a step change during the mid-1980s in conjunction with a step increase in spring and early-summer atmospheric pressures and temperatures over the western United States. The cause of the step change has not yet been determined.
Abstract
This paper examines an archive containing over 40 years of 8-day atmospheric forecasts over the contiguous United States from the NCEP reanalysis project to assess the possibilities for using medium-range numerical weather prediction model output for predictions of streamflow. This analysis shows the biases in the NCEP forecasts to be quite extreme. In many regions, systematic precipitation biases exceed 100% of the mean, with temperature biases exceeding 3°C. In some locations, biases are even higher. The accuracy of NCEP precipitation and 2-m maximum temperature forecasts is computed by interpolating the NCEP model output for each forecast day to the location of each station in the NWS cooperative network and computing the correlation with station observations. Results show that the accuracy of the NCEP forecasts is rather low in many areas of the country. Most apparent is the generally low skill in precipitation forecasts (particularly in July) and low skill in temperature forecasts in the western United States, the eastern seaboard, and the southern tier of states. These results outline a clear need for additional processing of the NCEP Medium-Range Forecast Model (MRF) output before it is used for hydrologic predictions.
Techniques of model output statistics (MOS) are used in this paper to downscale the NCEP forecasts to station locations. Forecasted atmospheric variables (e.g., total column precipitable water, 2-m air temperature) are used as predictors in a forward screening multiple linear regression model to improve forecasts of precipitation and temperature for stations in the National Weather Service cooperative network. This procedure effectively removes all systematic biases in the raw NCEP precipitation and temperature forecasts. MOS guidance also results in substantial improvements in the accuracy of maximum and minimum temperature forecasts throughout the country. For precipitation, forecast improvements were less impressive. MOS guidance increases the accuracy of precipitation forecasts over the northeastern United States, but overall, the accuracy of MOS-based precipitation forecasts is slightly lower than the raw NCEP forecasts.
Four basins in the United States were chosen as case studies to evaluate the value of MRF output for predictions of streamflow. Streamflow forecasts using MRF output were generated for one rainfall-dominated basin (Alapaha River at Statenville, Georgia) and three snowmelt-dominated basins (Animas River at Durango, Colorado; East Fork of the Carson River near Gardnerville, Nevada; and Cle Elum River near Roslyn, Washington). Hydrologic model output forced with measured-station data was used as “truth” to focus attention on the hydrologic effects of errors in the MRF forecasts. Eight-day streamflow forecasts produced using the MOS-corrected MRF output as input (MOS) were compared with those produced using the climatic Ensemble Streamflow Prediction (ESP) technique. MOS-based streamflow forecasts showed increased skill in the snowmelt-dominated river basins, where daily variations in streamflow are strongly forced by temperature. In contrast, the skill of MOS forecasts in the rainfall-dominated basin (the Alapaha River) was equivalent to the skill of the ESP forecasts. Further improvements in streamflow forecasts require more accurate local-scale forecasts of precipitation and temperature, more accurate specification of basin initial conditions, and more accurate model simulations of streamflow.
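A forward screening multiple linear regression of the kind used for MOS can be sketched in a few lines. This is a generic illustration, not the paper's procedure: the stopping rule (a minimum gain in explained variance, `min_gain`), the function name `forward_screening`, and the synthetic predictors are all assumptions made for the example.

```python
import numpy as np

def forward_screening(X, y, min_gain=0.01, max_predictors=None):
    """Greedy forward selection of predictors for a linear regression."""
    n, p = X.shape
    max_predictors = p if max_predictors is None else max_predictors
    total_ss = np.sum((y - y.mean()) ** 2)
    selected, best_r2 = [], 0.0
    while len(selected) < max_predictors:
        candidates = []
        for j in range(p):
            if j in selected:
                continue
            # Fit intercept + already selected predictors + candidate j
            A = np.column_stack([np.ones(n), X[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ coef
            candidates.append((1.0 - resid @ resid / total_ss, j))
        if not candidates:
            break
        r2, j = max(candidates)
        if r2 - best_r2 < min_gain:               # stop when the gain is too small
            break
        selected.append(j)
        best_r2 = r2
    # Refit the final equation with the retained predictors only
    A = np.column_stack([np.ones(n), X[:, selected]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return selected, coef

# Synthetic example: 5 candidate predictors, only columns 1 and 3 matter
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 5))
y = 2.0 * X[:, 1] - 1.0 * X[:, 3] + 0.5 * rng.standard_normal(500)
selected, coef = forward_screening(X, y)
```

In the returned `coef`, the first entry is the intercept and the remaining entries follow the order in which predictors were selected.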
Abstract
At least four different modeling studies indicate that variability in snow cover over Asia may modulate atmospheric circulation over the North Pacific Ocean during winter. Here, satellite data on snow extent for east Asia for 1971–95 along with atmospheric fields from the National Centers for Environmental Prediction–National Center for Atmospheric Research reanalysis are used to examine whether the circulation signals seen in model results are actually observed in nature. Anomalies in snow extent over east Asia exhibit a distinct lack of persistence. This suggests that understanding the effects of east Asian snow cover is more germane for short- to medium-range weather forecasting applications than for problems on longer timescales. While it is impossible to attribute cause and effect in the empirical study, analyses of composite fields demonstrate relationships between snow cover extremes and atmospheric circulation downstream remarkably similar to those identified in model results. Positive snow cover extremes in midwinter are associated with a small decrease in air temperatures over the transient snow regions, a stronger east Asian jet, and negative geopotential height anomalies over the North Pacific Ocean. Opposing responses are observed for negative snow cover extremes. Diagnosis of storm track feedbacks shows that the action of high-frequency eddies does not reinforce circulation anomalies in positive snow cover extremes. However, in negative snow cover extremes, there are significant decreases in high-frequency eddy activity over the central North Pacific Ocean, and a corresponding decrease in the mean cyclonic effect of these eddies on the geopotential tendency, contributing to observed positive height anomalies over the North Pacific Ocean. The circulation signals over the North Pacific Ocean are much more pronounced in midwinter (January–February) than in the transitional seasons (November–December and March–April).
Abstract
This paper provides a detailed description of the relationship between spring snow mass in the mountain areas of the western United States and summertime precipitation in the southwestern United States associated with the North American monsoon system and examines the hypothesis that antecedent spring snow mass can modulate monsoon rains through effects on land surface energy balance. Analysis of spring snow water equivalent (SWE) and July–August (JA) precipitation for the period of 1948–97 confirms the inverse snow–monsoon relationship noted in previous studies. Examination of regional differences in SWE–JA precipitation associations shows that although JA precipitation in New Mexico is significantly correlated with SWE over much larger areas than in Arizona, the correlations are just as strong overall in Arizona as in New Mexico. Results from this study also illustrate that the snow–monsoon relationship is unstable over time. In New Mexico, the relationship is strongest during 1965–92 and is weaker outside that period. By contrast, Arizona shows strongest snow–monsoon associations before 1970. The temporal coincidence between stronger snow–monsoon associations over Arizona and weaker snow–monsoon associations over New Mexico (and vice versa) suggests a common forcing mechanism and that the variations in the strength of snow–monsoon associations are more than just climate noise. There is a need to understand how other factors modulate monsoonal rainfall before realistic predictions of summertime precipitation in the Southwest can be made.
Abstract
An effort is under way aimed at historical analysis and monitoring of the pan-Arctic terrestrial drainage system. A key element is the provision of gridded precipitation time series that can be readily updated. This has proven to be a daunting task. Except for a few areas, the station network is sparse, with large measurement biases due to poor catch efficiency of solid precipitation. The variety of gauges used by different countries along with different reporting practices introduces further uncertainty. Since about 1990, there has been serious degradation of the monitoring network due to station closure and a trend toward automation in Canada.
Station data are used to compile monthly gridded time series for the 30-yr period 1960–89 at a cell resolution of 175 km. The station network is generally sufficient to estimate the mean and standard deviation of precipitation at this scale (hence the statistical distributions). However, as the interpolation procedures must typically draw from stations well outside of the grid box bounds, grid box time series are poorly represented. Accurately capturing time series requires typically four stations per 175-km cell, but only 38% of cells contain even a single station.
Precipitation updates at about a 1-month time lag can be obtained by using the observed precipitation distributions to rescale precipitation forecasts from the NCEP-1 reanalysis via a nonparametric probability transform. While recognizing inaccuracies in the observed time series, cross-validated correlation analyses indicate that the rescaled NCEP-1 forecasts have considerable skill in some parts of the Arctic drainage, but perform poorly over large regions. Treating climatology as a first guess with replacement by rescaled NCEP-1 values in areas of demonstrated skill yields a marginally useful monitoring product on the scale of large watersheds. Further improvements are realized by assimilating data from a limited array of station updates via a simple replacement strategy, and by including aerological estimates of precipitation less evapotranspiration (P − ET) within the initial rescaling procedure. Doing a better job requires better observations and an improved atmospheric model. The new ERA-40 reanalysis may fill the latter need.
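A nonparametric probability transform of this kind can be sketched as an empirical quantile mapping: each forecast is assigned its nonexceedance probability in the forecast climatology, and the observed value at the same probability is returned. The sketch below is a generic illustration rather than the procedure used in the paper. The function name `probability_transform`, the simple rank-based plotting position, and the synthetic gamma-distributed samples are assumptions.

```python
import numpy as np

def probability_transform(forecast, forecast_clim, observed_clim):
    """Rescale forecasts to the observed distribution via empirical quantiles."""
    forecast_sorted = np.sort(forecast_clim)
    # Empirical nonexceedance probability of each forecast value
    ranks = np.searchsorted(forecast_sorted, forecast, side="right")
    prob = np.clip(ranks / forecast_sorted.size, 0.0, 1.0)
    # Observed value at the same probability (linear interpolation between ranks)
    return np.quantile(observed_clim, prob)

# Synthetic monthly precipitation (mm): the forecasts are biased wet
rng = np.random.default_rng(3)
obs_clim = rng.gamma(shape=2.0, scale=15.0, size=360)
fc_clim = 2.0 * rng.gamma(shape=2.0, scale=15.0, size=360) + 10.0
rescaled = probability_transform(fc_clim, fc_clim, obs_clim)
```

The transform preserves the rank order of the forecasts while forcing the rescaled values to lie within the range of the observed sample, so it removes distributional bias but cannot add skill that the forecast ranks do not already contain.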
Abstract
Stations are an important source of meteorological data, but often suffer from missing values and short observation periods. Gap filling is widely used to generate serially complete datasets (SCDs), which are subsequently used to produce gridded meteorological estimates. However, the value of SCDs in spatial interpolation has scarcely been studied. Based on our recent efforts to develop an SCD over North America (SCDNA), we explore the extent to which gap filling improves gridded precipitation and temperature estimates. We address two specific questions: 1) Can SCDNA improve the statistical accuracy of gridded estimates in North America? 2) Can SCDNA improve estimates of trends in gridded data? In addressing these questions, we also evaluate the extent to which results depend on the spatial density of the station network and the spatial interpolation methods used. Results show that the improvement in statistical interpolation due to gap filling is most obvious for precipitation, followed by minimum temperature and maximum temperature. The improvement is larger when the station network is sparse and when simpler interpolation methods are used. SCDs can also notably reduce the uncertainties in spatial interpolation. Our evaluation across North America from 1979 to 2018 demonstrates that SCDs improve the accuracy of interpolated estimates for most stations and days. SCDNA-based interpolation also obtains better trend estimation than observation-based interpolation. This occurs because the stations used for interpolation can change during a specific period, causing changepoints in interpolated temperature estimates and affecting the long-term trends of observation-based interpolation; using SCDNA avoids this problem. Overall, SCDs improve the performance of gridded precipitation and temperature estimates.
Abstract
Gridded meteorological estimates are essential for many applications. Most existing meteorological datasets are deterministic and have limitations in representing the inherent uncertainties from both the data and methodology used to create gridded products. We develop the Ensemble Meteorological Dataset for Planet Earth (EM-Earth) for precipitation, mean daily temperature, daily temperature range, and dewpoint temperature at 0.1° spatial resolution over global land areas from 1950 to 2019. EM-Earth provides hourly/daily deterministic estimates, and daily probabilistic estimates (25 ensemble members), to meet the diverse requirements of hydrometeorological applications. To produce EM-Earth, we first developed a station-based Serially Complete Earth (SC-Earth) dataset, which removes the temporal discontinuities in raw station observations. Then, we optimally merged SC-Earth station data and ERA5 estimates to generate EM-Earth deterministic estimates and their uncertainties. The EM-Earth ensemble members are produced by sampling from parametric probability distributions using spatiotemporally correlated random fields. The EM-Earth dataset is evaluated by leave-one-out validation, using independent evaluation stations, and comparing it with many widely used datasets. The results show that EM-Earth is better in Europe, North America, and Oceania than in Africa, Asia, and South America, mainly due to differences in the available stations and differences in climate conditions. Probabilistic spatial meteorological datasets are particularly valuable in regions with large meteorological uncertainties, where almost all existing deterministic datasets face great challenges in obtaining accurate estimates.
Abstract
Meteorological data from ground stations suffer from temporal discontinuities caused by missing values and short measurement periods. Gap-filling and reconstruction techniques have proven to be effective in producing serially complete station datasets (SCDs) that are used for a myriad of meteorological applications (e.g., developing gridded meteorological datasets and validating models). To our knowledge, all existing SCDs have been developed at regional scales. In this study, we developed the serially complete Earth (SC-Earth) dataset, which provides daily precipitation, mean temperature, temperature range, dewpoint temperature, and wind speed data from 1950 to 2019. SC-Earth utilizes raw station data from the Global Historical Climatology Network–Daily (GHCN-D) and the Global Surface Summary of the Day (GSOD). A unified station repository is generated based on GHCN-D and GSOD after station merging and strict quality control. ERA5 is optimally matched with station data considering the time shift issue and then used to assist the global gap filling. SC-Earth is generated by merging estimates from 15 strategies based on quantile mapping, spatial interpolation, machine learning, and multistrategy merging. The final estimates are bias corrected using a combination of quantile mapping and quantile delta mapping. Comprehensive validation demonstrates that SC-Earth has high accuracy around the globe, with degraded quality in the tropics and oceanic islands due to sparse station networks, strong spatial precipitation gradients, and degraded ERA5 estimates. Meanwhile, SC-Earth inherits potential limitations such as inhomogeneity and precipitation undercatch from raw station data, which may affect its application in some cases. Overall, the high-quality and high-density SC-Earth dataset will benefit research in the fields of hydrology, ecology, meteorology, and climate. The dataset is available at https://zenodo.org/record/4762586.
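Quantile mapping, one of the techniques named in this abstract, can be illustrated with a minimal empirical sketch. It is a generic textbook form, not the SC-Earth procedure: the function name, the number of quantiles, and the use of linear interpolation between quantiles are assumptions made for illustration, and quantile delta mapping is not shown.

```python
import numpy as np


def quantile_map(model_hist, obs_hist, model_new, n_quantiles=101):
    """Empirical quantile mapping of model values onto the observed distribution.

    model_hist, obs_hist: samples from a common reference period.
    model_new: model values to correct (for example, estimates in a data gap).
    """
    probs = np.linspace(0.0, 1.0, n_quantiles)
    model_q = np.quantile(model_hist, probs)
    obs_q = np.quantile(obs_hist, probs)
    # Locate each new value within the model quantiles and read off the
    # observed value at the same cumulative probability. Values outside the
    # reference range are clipped to the observed extremes by np.interp.
    # Assumption: model_q is strictly increasing (continuous data); series
    # with many ties, such as dry-day zeros, need extra handling.
    return np.interp(model_new, model_q, obs_q)
```

Applied to a reanalysis series and a station series from an overlapping period, the corrected reanalysis values take on the station's distribution, which is why the method is a natural building block for gap filling and bias correction.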