Search Results

Showing 1–10 of 20 items for Author or Editor: Samuel S. P. Shen
Nicholas R. Cavanaugh and Samuel S. P. Shen

Abstract

The first four statistical moments and their trends are calculated for the average daily surface air temperature (SAT) from 1950 to 2010 using the Global Historical Climatology Network–Daily station data for each season relative to the 1961–90 climatology over the Northern Hemisphere. Temporal variations of daily SAT probability distributions are represented as generalized linear regression coefficients on the mean, standard deviation, skewness, and kurtosis calculated for each 10-yr moving time window from 1950–59 to 2001–10. The climatology and trends of these statistical moments suggest that daily SAT probability distributions are non-Gaussian and are changing in time. The climatology of the first four statistical moments has distinct spatial patterns, with large coherent structure for the mean and standard deviation and relatively smaller, more regionalized patterns for the skewness and kurtosis. The linear temporal trends from 1950 to 2010 of the first four moments also have coherent spatial patterns. The linear temporal trends in these characterizing statistical moments are statistically significant at most locations and have spatial patterns that differ from moment to moment. The regionalized variations specific to the higher moments may be related to the climate dynamics that contribute to extremes. The nonzero skewness and kurtosis make this detailed documentation of the higher statistical moments useful for quantifying climate changes and assessing climate model uncertainties.
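A minimal Python sketch of the moving-window moment calculation described above, using synthetic daily anomalies in place of the GHCN-Daily station records; the window length matches the abstract, but the series, trend size, and distribution are illustrative assumptions, not the paper's processing.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic daily SAT anomalies for one station, 1950-2010 (illustrative only).
daily = {yr: rng.gamma(shape=4.0, scale=1.0, size=365) - 4.0 + 0.01 * (yr - 1950)
         for yr in range(1950, 2011)}

# First four moments for each 10-yr moving window (1950-59, 1951-60, ..., 2001-10).
windows, moments = [], []
for start in range(1950, 2002):
    sample = np.concatenate([daily[yr] for yr in range(start, start + 10)])
    moments.append([sample.mean(), sample.std(ddof=1),
                    stats.skew(sample), stats.kurtosis(sample)])
    windows.append(start + 4.5)          # window-center year

moments = np.array(moments)

# Linear temporal trend of each moment (per decade), as in the trend maps.
for name, series in zip(["mean", "std", "skew", "kurt"], moments.T):
    slope, _, _, p_value, _ = stats.linregress(windows, series)
    print(f"{name:>4}: trend = {10 * slope:+.3f} per decade (p = {p_value:.3f})")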

Full access
Nicholas R. Cavanaugh and Samuel S. P. Shen

Abstract

This paper explores the effects of averaging weather station data onto a grid on the first four statistical moments of daily minimum and maximum surface air temperature (SAT) anomalies over the entire globe. The Global Historical Climatology Network–Daily (GHCND) and the Met Office Hadley Centre GHCND (HadGHCND) datasets from 1950 to 2010 are examined. The GHCND station data exhibit large spatial patterns for each moment and statistically significant moment trends from 1950 to 2010, indicating that SAT probability density functions are non-Gaussian and have undergone characteristic changes in shape due to decadal variability and/or climate change. Comparisons with station data show that gridded averages always underestimate observed variability, particularly in the extremes, and have altered moment trends that are in some cases opposite in sign over large geographic areas. A statistical closure approach based on the quasi-normal approximation is taken to explore SAT's higher-order moments and point correlation structure. This study focuses specifically on relating variability calculated from station data to that from gridded data through the moment equations for weighted sums of random variables. The higher-order and nonlinear spatial correlations up to the fourth order demonstrate that higher-order moments at the grid scale can be determined approximately by functions of station pair correlations that tend to follow the usual Kolmogorov scaling relation. These results can aid in the development of constraints to reduce uncertainties in climate models and have implications for studies of atmospheric variability, extremes, and climate change using gridded observations.
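The damping of higher moments by grid-box averaging can be illustrated with a weighted sum of correlated variables. The sketch below is only a Monte Carlo schematic under assumed synthetic "station" anomalies and equal weights; it is not the paper's closure calculation or its closed-form moment equations.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic anomalies for 5 correlated, skewed "stations" in one grid box.
n_days, n_sta = 20000, 5
common = rng.gamma(2.0, 1.0, n_days) - 2.0          # shared, skewed signal
local = rng.gamma(2.0, 1.0, (n_days, n_sta)) - 2.0  # independent local noise
station = 0.6 * common[:, None] + 0.8 * local       # pairwise-correlated stations

# Grid-box value: weighted sum of the stations (equal weights here).
w = np.full(n_sta, 1.0 / n_sta)
grid = station @ w

# Station-average moments vs. grid-average moments: averaging damps the
# higher moments, consistent with the underestimation discussed above.
for name, f in [("std", lambda x: x.std(ddof=1)),
                ("skew", stats.skew), ("kurt", stats.kurtosis)]:
    sta_mean = np.mean([f(station[:, j]) for j in range(n_sta)])
    print(f"{name:>4}: station mean = {sta_mean:+.3f}, grid = {f(grid):+.3f}")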

Full access
Thomas M. Smith, Samuel S. P. Shen, and Ralph R. Ferraro

Abstract

Extended precipitation forecasts, with leads of weeks to seasons, are valuable for planning water use and are produced by the U.S. National Weather Service. Forecast skill tends to be low, and any skill improvement could be valuable. Here, methods are discussed for improving statistical precipitation forecasting over the contiguous United States. Monthly precipitation is forecast using predictors from the previous month. Testing shows that improvements are obtained both from improved statistical methods and from the use of satellite-based ocean-area precipitation predictors. Ensemble statistical forecasting combines individual forecasts; the statistical superensemble method gives higher skill than traditional statistical forecasting. The proposed superensemble is a weighted mean of many forecasts, or of forecasts from different prediction systems, and uses forecast reliability estimates to define the weights. The method is tested with different predictors to show its skill and how skill can be improved using additional predictors. Cross validation is used to evaluate the skill. Although predictions are strongly influenced by ENSO, other regions contribute more to the forecast skill in the superensemble. The superensemble optimally combines forecasts based on different predictor regions and predictor types. The contribution from multiple predictor regions improves skill and reduces the ENSO spring barrier. Adding satellite-based ocean-area precipitation predictors noticeably increases forecast skill. The resulting skill is comparable to that of dynamic-model forecasts, but the regions with the best forecast skill may differ. This paper shows that the statistical superensemble forecasts may be complementary to dynamic forecasts and that combining them may further increase forecast skill.
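A hedged sketch of the weighting idea behind such a superensemble: member forecasts are combined with weights derived from each member's cross-validated reliability. Here the weights are simply inverse error variances on a held-out period and the forecasts are synthetic; the paper's actual weighting and validation scheme is more elaborate.

import numpy as np

rng = np.random.default_rng(2)

# Synthetic monthly precipitation anomalies (truth) and three imperfect
# statistical forecasts of differing quality.
n_months = 240
truth = rng.normal(0.0, 1.0, n_months)
members = np.stack([truth + rng.normal(0.0, s, n_months) for s in (0.5, 1.0, 2.0)])

# Reliability estimate for each member from a simple split-sample validation:
# inverse error variance on the training half of the record.
train, verif = slice(0, 120), slice(120, None)
err_var = np.array([np.var(m[train] - truth[train], ddof=1) for m in members])
weights = (1.0 / err_var) / np.sum(1.0 / err_var)

# Superensemble forecast = reliability-weighted mean of the members.
super_fc = weights @ members

for label, fc in [("equal-weight mean", members.mean(axis=0)),
                  ("superensemble", super_fc)]:
    r = np.corrcoef(fc[verif], truth[verif])[0, 1]
    print(f"{label:>18}: verification correlation = {r:.3f}")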

Full access
Samuel S. P. Shen, Peter Dzikowski, Guilong Li, and Darren Griffith

Abstract

Soil quality models developed for ecodistrict polygons (EDP) and the polygons of the Soil Landscapes of Canada (SLC) to monitor the concentration of soil organic matter require daily climate data as an important input. The objectives of this paper are (i) to provide a method that interpolates the daily station data onto the 894 SLC polygons and 150 EDP in the province of Alberta, Canada, so that the interpolated data fit not only the climate mean but also the climate variability, especially for the precipitation field, and hence can be used as realistic climate input to soil quality models, and (ii) to understand the variability of the daily Alberta climate, such as precipitation frequency. The procedure interpolates the station data onto a dense network of grid points and then averages the gridpoint values inside each polygon. The procedure and results for maximum temperature, minimum temperature, and precipitation are reported in detail. The interpolation uses the observed daily data for the period 1 January 1961–31 December 1997 (13 514 days) within the latitude–longitude box (45°–64°N, 116°–124°W). Because the precipitation field can have a short spatial correlation length scale and large variability, a hybrid of the inverse-distance-weight and nearest-station-assignment methods is developed for interpolating the precipitation data. This method can reliably calculate not only the number of precipitation days per month but also the precipitation amount for a given day. The temperature field has a long spatial correlation scale, and its data are interpolated by the inverse-distance-weight method. Cross-validation shows that the interpolated results on polygons are accurate and appropriate for soil quality models. The computing algorithm uses all the daily observed climate data, even though some stations have a very short time record or only summer records.
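A minimal sketch of inverse-distance-weight interpolation of station values to a grid point, with a nearest-station fallback of the kind the hybrid precipitation scheme combines. The cutoff radius, power, planar distances, and example values are illustrative placeholders, not the paper's calibrated parameters.

import numpy as np

def idw_interpolate(xy_stations, values, xy_target, power=2.0, radius_km=60.0):
    """Inverse-distance-weight estimate at xy_target; fall back to the
    nearest station when no station lies within radius_km (illustrative
    hybrid rule; distances here are planar, not great-circle)."""
    d = np.hypot(*(xy_stations - xy_target).T)       # distances in km
    if d.min() == 0.0:                               # target coincides with a station
        return values[np.argmin(d)]
    inside = d <= radius_km
    if not inside.any():                             # nearest-station assignment
        return values[np.argmin(d)]
    w = 1.0 / d[inside] ** power
    return np.sum(w * values[inside]) / np.sum(w)

# Example: daily precipitation (mm) at four stations, interpolated to one grid point.
stations = np.array([[0.0, 0.0], [50.0, 10.0], [10.0, 40.0], [200.0, 200.0]])
precip = np.array([0.0, 12.5, 3.0, 25.0])
print(idw_interpolate(stations, precip, np.array([20.0, 20.0])))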

Full access
Gerald R. North, Samuel S. P. Shen, and Robert Upson

Abstract

This paper examines the sampling characteristics of combining data collected by several low-orbiting satellites attempting to estimate the space–time average of rain rates. The satellites can have different orbital and swath-width parameters. The satellite overpasses are allowed to make partial-coverage snapshots of the grid box with each overpass. Such partial visits are treated approximately by letting the fraction of the grid box intersected by a particular satellite swath be a random variable, with mean and variance parameters computed from exact orbit calculations. The derivation procedure is based upon the spectral minimum mean-square error formalism introduced by North and Nakamoto. By using a simple parametric form for the space–time spectral density, simple formulas are derived for a large number of examples, including the combination of the Tropical Rainfall Measuring Mission with an operational sun-synchronous orbiter. The approximations and results are discussed, and directions for future research are summarized.
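In schematic form, a spectral minimum mean-square error formalism of this kind writes the sampling error as an integral of the rain-rate space–time spectral density against a squared sampling "design filter." The expression below only indicates that generic shape; the notation H_s and H_a for the sampling and ideal averaging filters is introduced here for illustration, and the orbit-specific filters in the paper carry additional detail.

\varepsilon^2 \;=\; \int\!\!\int \bigl| H_s(\mathbf{k},\omega) - H_a(\mathbf{k},\omega) \bigr|^2 \, S(\mathbf{k},\omega) \, d^2k \, d\omega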

Full access
Samuel S. P. Shen, Christine K. Lee, and Jay Lawrimore

Abstract

This paper estimates the sampling error variances of the gridded monthly U.S. Historical Climatology Network, version 2 (USHCN V2), time-of-observation-bias (TOB)-adjusted data. The analysis of mean surface air temperature (SAT) assesses uncertainties, trends, and the rankings of the hottest and coldest years for the contiguous United States in the period 1895–2008. Data from the USHCN stations are aggregated onto a 2.5° × 3.5° latitude–longitude grid by an arithmetic mean of the stations inside each grid box. The sampling error variances of the gridded monthly data are estimated for every month and every grid box with data. The gridded data and their sampling error variances are used to calculate the contiguous U.S. averages and their trends and associated uncertainties. The sampling error variances are smaller (mostly less than 0.2°C²) over the eastern United States, where the station density is greater, and larger (with values of 1.3°C² for some grid boxes in the earlier period) over mountain and coastal areas. In the period 1895–2008, every month from January to December has a positive linear trend. February has the largest trend, of 0.162°C (10 yr)⁻¹, and September has the smallest, at 0.020°C (10 yr)⁻¹. The three hottest (coldest) years measured by the mean SAT over the United States were ranked as 1998, 2006, and 1934 (1917, 1895, and 1912).
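A small Python sketch of the two bookkeeping steps described above: aggregating station anomalies to a grid box by an arithmetic mean and fitting a linear trend expressed per decade. The station count, anomaly values, and the rough error estimate at the end are toy assumptions; the USHCN TOB adjustments and the paper's sampling-error estimator are not reproduced.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Toy monthly-mean SAT anomalies (degC) for 6 stations in one 2.5 x 3.5 deg box, 1895-2008.
years = np.arange(1895, 2009)
n_sta = 6
station_anom = 0.007 * (years - years[0])[:, None] + rng.normal(0.0, 0.8, (years.size, n_sta))

# Grid-box value = arithmetic mean of the stations inside the box.
gridbox = station_anom.mean(axis=1)

# Linear trend, expressed per decade as in the abstract.
slope, intercept, r, p, se = stats.linregress(years, gridbox)
print(f"trend = {10 * slope:.3f} degC per decade (p = {p:.3g})")

# A simple spread-based sampling-error variance for the grid-box mean
# (illustrative only; the paper's estimator accounts for station correlation).
sampling_var = station_anom.var(axis=1, ddof=1).mean() / n_sta
print(f"rough sampling error variance = {sampling_var:.3f} degC^2")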

Full access
Tyler Tucker, Donata Giglio, Megan Scanderbeg, and Samuel S. P. Shen

Abstract

Since the mid-2000s, the Argo oceanographic observational network has provided near-real-time four-dimensional data for the global ocean for the first time in history. Internet (i.e., the "web") applications that handle the more than two million Argo profiles of ocean temperature, salinity, and pressure are an active area of development. This paper introduces a new and efficient interactive Argo data visualization and delivery web application named Argovis that is built on a classic three-tier design consisting of a front end, a back end, and a database. Together these components allow users to navigate 4D data on a world map of Argo floats, with the option to select a custom region, depth range, and time period. Argovis's back end sends data to users in a simple format, and the front end quickly renders web-quality figures. More advanced applications query Argovis from other programming environments, such as Python, R, and MATLAB. Our Argovis architecture allows expert data users to build their own functionality for specific applications, such as the creation of spatially gridded data for a given time and advanced time–frequency analysis for a space–time selection. Argovis is aimed at both scientists and the public, with tutorials and examples available on the website describing how to use the Argovis data delivery system, for example, how to plot profiles in a region over time or how to monitor profile metadata.
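A sketch of querying Argovis from Python for a region, depth range, and time period. The endpoint path and parameter names below are placeholders assumed for illustration and should be checked against the Argovis documentation and tutorials; only the general pattern of an HTTP request returning JSON profiles is intended.

import requests

# Placeholder endpoint and parameters -- verify against the Argovis documentation.
URL = "https://argovis.colorado.edu/selection/profiles"
params = {
    "startDate": "2020-01-01",
    "endDate": "2020-01-10",
    "shape": "[[[-155,20],[-150,20],[-150,25],[-155,25],[-155,20]]]",  # lon/lat polygon
    "presRange": "[0,500]",                                            # dbar
}

resp = requests.get(URL, params=params, timeout=60)
resp.raise_for_status()
profiles = resp.json()   # list of Argo profiles in the selected region/depth/time
print(len(profiles), "profiles returned")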

Open access
Samuel S. P. Shen, Gerald R. North, and Kwang-Y. Kim

Abstract

Using EOF analysis and statistical optimal averaging techniques, this paper investigates the random sampling error in estimating the global average temperature from a network of surface stations. The EOF representation makes it unnecessary to use simplified empirical models of the correlation structure of temperature anomalies. If an adjustable weight is assigned to each station according to the criterion of minimum mean-square error, a formula for this error can be derived that consists of a sum of contributions from successive EOF modes. The EOFs were calculated from both observed data and a noise-forced energy balance model (EBM) for the problems of one-year and five-year averages. The mean-square statistical sampling error depends on the spatial distribution of the stations, the length of the averaging interval, and the choice of weight for each station data stream. Examples used here include four symmetric configurations of 4 × 4, 6 × 4, 9 × 7, and 20 × 10 stations and the Angell–Korshover configuration. Comparisons with the 100-yr U.K. dataset show that the global temperature anomaly averages computed from the full dataset and from this study's sparse configurations are highly correlated. For example, the 63-station Angell–Korshover network with uniform weighting explains 92.7% of the total variance, whereas the same network with optimal weighting explains 97.8% of the total variance of the U.K. dataset.
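A schematic of the minimum mean-square-error weighting idea on synthetic data: optimal station weights follow from the station covariance matrix and the station/global-mean covariances. This is the equivalent covariance form of the estimator; the paper derives the same error as a sum over EOF modes, and the field, network size, and covariance model below are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(4)

# Synthetic "truth": a field on 100 grid cells; the target is its area mean,
# sampled by 12 stations (a stand-in for the sparse networks discussed above).
n_time, n_grid, n_sta = 5000, 100, 12
cov = np.fromfunction(lambda i, j: np.exp(-np.abs(i - j) / 10.0), (n_grid, n_grid))
field = rng.multivariate_normal(np.zeros(n_grid), cov, size=n_time)
global_mean = field.mean(axis=1)
sta_idx = rng.choice(n_grid, n_sta, replace=False)
stations = field[:, sta_idx]

# Minimum mean-square-error weights: w = C^{-1} c, with C the station covariance
# and c the covariance between each station and the true global mean.
C = np.cov(stations, rowvar=False)
c = np.array([np.cov(stations[:, k], global_mean)[0, 1] for k in range(n_sta)])
w_opt = np.linalg.solve(C, c)

for name, w in [("uniform", np.full(n_sta, 1.0 / n_sta)), ("optimal", w_opt)]:
    err = stations @ w - global_mean
    print(f"{name:>7}: explained variance = {1 - err.var() / global_mean.var():.3f}")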

Full access
Samuel S. P. Shen, Gregory P. Behm, Y. Tony Song, and Tangdong Qu

Abstract

This paper provides a spectral optimal gridding (SOG) method to make a dynamically consistent reconstruction of water temperature for the global ocean at different depth levels. The dynamical consistency is achieved by using a basis of empirical orthogonal functions (EOFs) derived from NASA Jet Propulsion Laboratory (JPL) non-Boussinesq ocean general circulation model (OGCM) output at ¼° resolution from 1958 to 2013. A convenient singular value decomposition (SVD) method is used to calculate the EOFs, enabling efficient computation on a fine global spatial grid. These EOFs are used as explanatory variables in a regression against the sparsely distributed in situ ocean temperature data at 33 standard depth levels. The observed data are aggregated onto a 1° latitude–longitude grid at each level from the surface to the 5500-m layer for the period 1950–2014. Three representative temperature reconstruction examples are presented and validated: two 10-m-layer (i.e., the second layer from the surface) reconstructions for January 2008 and January 1998, which are compared with independent sea surface temperature (SST) observations; and one 100-m-layer reconstruction for January 1998, which shows a strong cold-anomaly El Niño signal of up to −5°C in the western tropical Pacific from 150°E to 140°W. The SOG reconstruction can accurately locate the El Niño signal region in different ocean layers. The SOG reconstruction method is shown to be reliable and yields satisfactory accuracy even with sparse data. Validation and error analysis indicate that no systematic biases exist between the observed and reconstructed data.
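A schematic of the EOF-regression step in such a reconstruction: compute EOFs of a complete model field by SVD, then least-squares fit the leading EOFs to sparse observations to fill the grid. The synthetic field, grid size, observation count, and number of retained modes are illustrative assumptions; this is not the JPL OGCM basis or the paper's quality control and error analysis.

import numpy as np

rng = np.random.default_rng(5)

# Complete "model" field (time x grid) used only to define the EOF basis.
n_time, n_grid = 600, 400
modes_true = rng.normal(0.0, 1.0, (6, n_grid))
amps = rng.normal(0.0, 1.0, (n_time, 6)) * np.array([5, 4, 3, 2, 1.5, 1.0])
model = amps @ modes_true + 0.3 * rng.normal(0.0, 1.0, (n_time, n_grid))

# EOFs from an SVD of the anomaly matrix (rows of vt are spatial patterns).
anom = model - model.mean(axis=0)
_, _, vt = np.linalg.svd(anom, full_matrices=False)
eofs = vt[:10]                                   # leading 10 EOFs as the basis

# Sparse "observations" at 60 of the 400 grid cells for one month.
truth = amps[0] @ modes_true
obs_idx = rng.choice(n_grid, 60, replace=False)
obs = truth[obs_idx] + 0.3 * rng.normal(0.0, 1.0, 60)

# Regress the observations onto the EOFs sampled at the observation points,
# then reconstruct the full grid from the fitted coefficients.
coef, *_ = np.linalg.lstsq(eofs[:, obs_idx].T, obs, rcond=None)
recon = coef @ eofs
print(f"reconstruction correlation with truth: {np.corrcoef(recon, truth)[0, 1]:.3f}")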

Full access
Thomas M. Smith, Samuel S. P. Shen, Li Ren, and Phillip A. Arkin

Abstract

Uncertainty estimates are computed for a statistical reconstruction of global monthly precipitation that was developed in an earlier publication. The reconstruction combined the use of spatial correlations with gauge precipitation and correlations between precipitation and related data beginning in 1900. Several types of errors contribute to the uncertainty, including errors associated with the reconstruction method and input data errors. The reconstruction uses correlated data for the ocean-area first guess, which contributes much of the uncertainty over those regions. Errors associated with the input data include random, sampling, and bias errors. Random and bias data errors are mostly filtered out of the reconstruction analysis and are the smallest components of the total error. The largest errors are associated with sampling and the method, which together dominate the total error. The uncertainty estimates in this study indicate that (i) over the oceans the reconstruction is most reliable in the tropics, especially the Pacific, because of the large spatial scales of ENSO; (ii) over the high-latitude oceans multidecadal variations are fairly reliable, but many month-to-month variations are not; and (iii) over and near land the errors are much smaller because of the local gauge data. The reconstruction indicates that the average precipitation increased early in the twentieth century, followed by several decades of multidecadal variations with little trend until near the end of the century, when precipitation again appears to increase systematically. The uncertainty estimates indicate that the average changes over land are most reliable, whereas over the oceans the average change over the reconstruction period is slightly larger than the uncertainty.

Full access