Search Results
Author or Editor: Matthew J. Menne
Abstract
Two statistical tests are described that can be used to detect potential inhomogeneities and errors in daily temperature observations. These tests, based on neighbor comparisons, differ from existing inhomogeneity tests by evaluating daily rather than monthly or annual observations and by focusing on a very short record length. Standardized difference series one month in length are formed between a candidate station, whose daily temperature time series is being evaluated, and a number of neighboring stations. These series, called D-series, approximate white noise when a candidate is like its neighbors and are other than white noise when the candidate is unlike its neighbors. Two white noise tests are then applied to the D-series in order to detect potential problems at the candidate station: a cross-correlation test and a lag 1 (1-day) autocorrelation test. Examples of errors and inhomogeneities detected through the application of the two tests on observations from the National Weather Service's Cooperative Observer Network are provided. These tests were designed specifically to detect inhomogeneities in an operational environment, that is, while data are being routinely processed. When a potential inhomogeneity is identified, timely action can be taken and feedback given, if necessary, to station field managers to prevent further corruption of the data record. While examples are provided using observations from the Cooperative Observer Network, these tests may be used in any temperature observation network with sufficient station density to provide a pool of neighboring stations.
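The D-series idea above can be illustrated with a minimal Python sketch. It assumes a single neighbor, standardization by the sample mean and standard deviation of the month's differences, and the textbook large-sample bound |r1| < 1.96/sqrt(n) as the white noise criterion. None of these choices is taken from the paper, which uses several neighbors and its own test statistics. The function names are hypothetical, and the cross-correlation test is not sketched here.

```python
import math
from statistics import mean, stdev


def d_series(candidate, neighbor):
    """Standardized candidate-minus-neighbor differences for one month."""
    diffs = [c - n for c, n in zip(candidate, neighbor)]
    mu, sigma = mean(diffs), stdev(diffs)
    if sigma == 0:
        raise ValueError("difference series is constant; cannot standardize")
    return [(d - mu) / sigma for d in diffs]


def lag1_autocorrelation(series):
    """Sample autocorrelation at a lag of one day."""
    mu = mean(series)
    num = sum((a - mu) * (b - mu) for a, b in zip(series, series[1:]))
    den = sum((x - mu) ** 2 for x in series)
    return num / den


def looks_like_white_noise(series, z=1.96):
    """Assumed criterion: |r1| below the approximate bound z / sqrt(n)."""
    return abs(lag1_autocorrelation(series)) < z / math.sqrt(len(series))


if __name__ == "__main__":
    # Synthetic month: the candidate tracks its neighbor, then shifts by 2 degrees.
    neighbor = [10.0 + 0.5 * (day % 7) for day in range(30)]
    candidate = [
        n + (0.1 if day % 2 == 0 else -0.1) + (2.0 if day >= 15 else 0.0)
        for day, n in enumerate(neighbor)
    ]
    d = d_series(candidate, neighbor)
    print(lag1_autocorrelation(d), looks_like_white_noise(d))
```

A mid-month step in the candidate leaves the differences clustered on one side of the mean and then the other, so the lag 1 autocorrelation is strongly positive and the series is flagged.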
Abstract
A procedure is described that provides guidance in determining the number of stations required in a climate observing system deployed to capture temporal variability in the spatial mean of a climate parameter. The method entails reducing the density of an existing station network in a step-by-step fashion and quantifying subnetwork performance at each iteration. Under the assumption that the full network for the study area provides a reasonable estimate of the true spatial mean, this degradation process can be used to quantify the relationship between station density and network performance. The result is a systematic “cost–benefit” relationship that can be used in conjunction with practical constraints to determine the number of stations to deploy.
The approach is demonstrated using temperature and precipitation anomaly data from 4012 stations in the conterminous United States over the period 1971–2000. Results indicate that a U.S. climate observing system should consist of at least 25 quasi-uniformly distributed stations in order to reproduce interannual variability in temperature and precipitation because gains in the calculated performance measures begin to level off with higher station numbers. If trend detection is a high priority, then a higher density network of 135 evenly spaced stations is recommended. Through an analysis of long-term observations from the U.S. Historical Climatology Network, the 135-station solution is shown to exceed the climate monitoring goals of the U.S. Climate Reference Network.
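The degradation procedure can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: stations are thinned by simple random sampling rather than by quasi-uniform spatial selection, the spatial mean is an unweighted average, and performance is measured only by the root-mean-square error against the full-network mean. The function names and the synthetic data are hypothetical.

```python
import math
import random
from statistics import mean


def network_mean(stations):
    """Unweighted mean across stations at each time step."""
    return [mean(values) for values in zip(*stations)]


def rmse(a, b):
    """Root-mean-square error between two equal-length series."""
    return math.sqrt(mean((x - y) ** 2 for x, y in zip(a, b)))


def degrade(stations, sizes, trials=50, seed=0):
    """Average RMSE of random subnetworks of each size against the full network."""
    rng = random.Random(seed)
    reference = network_mean(stations)
    results = {}
    for k in sizes:
        errors = [
            rmse(network_mean(rng.sample(stations, k)), reference)
            for _ in range(trials)
        ]
        results[k] = mean(errors)
    return results


if __name__ == "__main__":
    # Synthetic anomalies: a shared regional signal plus station-level noise.
    rng = random.Random(1)
    signal = [rng.gauss(0, 1) for _ in range(30)]
    stations = [[s + rng.gauss(0, 1) for s in signal] for _ in range(40)]
    for size, err in degrade(stations, sizes=[2, 5, 10, 20, 40]).items():
        print(size, round(err, 3))
```

Plotting the error against subnetwork size gives the kind of cost-benefit curve the abstract describes, where the gain from each added station shrinks as the network grows.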
Abstract
An evaluation of three hypothesis test statistics that are commonly used in the detection of undocumented changepoints is described. The goal of the evaluation was to determine whether the use of multiple tests could improve undocumented, artificial changepoint detection skill in climate series. The use of successive hypothesis testing is compared to optimal approaches, both of which are designed for situations in which multiple undocumented changepoints may be present. In addition, the importance of the form of the composite climate reference series is evaluated, particularly with regard to the impact of undocumented changepoints in the various component series that are used to calculate the composite.
In a comparison of single test changepoint detection skill, the composite reference series formulation is shown to be less important than the choice of the hypothesis test statistic, provided that the composite is calculated from the serially complete and homogeneous component series. However, each of the evaluated composite series is not equally susceptible to the presence of changepoints in its components, which may be erroneously attributed to the target series. Moreover, a reference formulation that is based on the averaging of the first-difference component series is susceptible to random walks when the composition of the component series changes through time (e.g., values are missing), and its use is, therefore, not recommended. When more than one test is required to reject the null hypothesis of no changepoint, the number of detected changepoints is reduced proportionately less than the number of false alarms in a wide variety of Monte Carlo simulations. Consequently, a consensus of hypothesis tests appears to improve undocumented changepoint detection skill, especially when reference series homogeneity is violated. A consensus of successive hypothesis tests using a semihierarchic splitting algorithm also compares favorably to optimal solutions, even when changepoints are not hierarchic.
Abstract
An automated homogenization algorithm based on the pairwise comparison of monthly temperature series is described. The algorithm works by forming pairwise difference series between serial monthly temperature values from a network of observing stations. Each difference series is then evaluated for undocumented shifts, and the station series responsible for such breaks is identified automatically. The algorithm also makes use of station history information, when available, to improve the identification of artificial shifts in temperature data. In addition, an evaluation is carried out to distinguish trend inhomogeneities from abrupt shifts. When the magnitude of an apparent shift attributed to a particular station can be reliably estimated, an adjustment is made for the target series. The pairwise algorithm is shown to be robust and efficient at detecting undocumented step changes under a variety of simulated scenarios with step- and trend-type inhomogeneities. Moreover, the approach is shown to yield a lower false-alarm rate for undocumented changepoint detection relative to the more common use of a reference series. Results from the algorithm are used to assess evidence for trend inhomogeneities in U.S. monthly temperature data.
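The pairwise logic can be illustrated with a deliberately simplified Python sketch. The shift detector used here (the split that maximizes the difference in segment means, compared with a fixed threshold) is a stand-in chosen for brevity and is not the changepoint test used by the algorithm. The attribution rule (a station is blamed only if every one of its pairs shows a shift) is likewise an assumption. All names and thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations
from statistics import mean


def largest_shift(diff, min_seg=6):
    """Return (index, magnitude) of the split with the largest mean change."""
    best_t, best_mag = None, 0.0
    for t in range(min_seg, len(diff) - min_seg + 1):
        mag = abs(mean(diff[t:]) - mean(diff[:t]))
        if mag > best_mag:
            best_t, best_mag = t, mag
    return best_t, best_mag


def attribute_breaks(series_by_station, threshold=0.5):
    """Blame stations whose difference series with every neighbor show a shift."""
    counts = Counter()
    for a, b in combinations(sorted(series_by_station), 2):
        diff = [x - y for x, y in zip(series_by_station[a], series_by_station[b])]
        t, mag = largest_shift(diff)
        if t is not None and mag > threshold:
            counts[a] += 1
            counts[b] += 1
    n_neighbors = len(series_by_station) - 1
    if n_neighbors < 2:
        return []  # a single pair cannot say which station shifted
    return [s for s, c in counts.items() if c == n_neighbors]


if __name__ == "__main__":
    # Four stations share one signal; station B jumps by 1 degree at month 24.
    signal = [float(m % 12) for m in range(48)]
    data = {
        "A": [s + 0.3 for s in signal],
        "B": [s - 0.2 + (1.0 if m >= 24 else 0.0) for m, s in enumerate(signal)],
        "C": [s + 0.1 for s in signal],
        "D": [s - 0.4 for s in signal],
    }
    print(attribute_breaks(data))
```

Because the shared climate signal cancels in each difference series, a shift at one station appears in all of that station's pairs and in none of the pairs among its neighbors, which is what allows the break to be attributed.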
Abstract
In this paper, the authors present a Bayes factor model for detecting undocumented artificial discontinuities in a network of temperature series. First, they generate multiple difference series for each station with the pairwise comparison approach. Next, they treat the detection problem as a Bayesian model selection problem and use Bayes factors to calculate the posterior probabilities of the discontinuities and estimate their locations in time and space. The model can be applied to large climate networks and realistic temperature series with missing data. The effectiveness of the model is illustrated with two realistic large-scale simulations and four sensitivity analyses. Results from applying the algorithm to observed monthly temperature data from the conterminous United States are also briefly discussed in the context of what is currently known about the nature of biases in the U.S. surface temperature record.
Abstract
The evaluation strategies outlined in this paper constitute a set of tools beneficial to the development and documentation of robust automated quality assurance (QA) procedures. Traditionally, thresholds for the QA of climate data have been based on target flag rates or statistical confidence limits. However, these approaches do not necessarily quantify a procedure’s effectiveness at detecting true errors in the data. Rather, as illustrated by way of an “extremes check” for daily precipitation totals, information on the performance of a QA test is best obtained through a systematic manual inspection of samples of flagged values combined with a careful analysis of geographical and seasonal patterns of flagged observations. Such an evaluation process not only helps to document the effectiveness of each individual test, but, when applied repeatedly throughout the development process, it also aids in choosing the optimal combination of QA procedures and associated thresholds. In addition, the approach described here constitutes a mechanism for reassessing system performance whenever revisions are made following initial development.
Abstract
In support of climate monitoring and assessments, the National Oceanic and Atmospheric Administration's (NOAA's) National Climatic Data Center has developed an improved version of the U.S. Historical Climatology Network temperature dataset (HCN version 2). In this paper, the HCN version 2 temperature data are described in detail, with a focus on the quality-assured data sources and the systematic bias adjustments. The bias adjustments are discussed in the context of their effect on U.S. temperature trends from the period 1895–2007 and in terms of the differences between version 2 and its widely used predecessor (now referred to as HCN version 1). Evidence suggests that the collective effect of changes in observation practice at U.S. HCN stations is systematic and of the same order of magnitude as the background climate signal. For this reason, bias adjustments are essential to reducing the uncertainty in U.S. climate trends. The largest biases in the HCN are shown to be associated with changes to the time of observation and with the widespread changeover from liquid-in-glass thermometers to the maximum–minimum temperature system (MMTS). With respect to HCN version 1, HCN version 2 trends in maximum temperatures are similar, while minimum temperature trends are somewhat smaller because of 1) an apparent overcorrection in HCN version 1 for the MMTS instrument change and 2) the systematic effect of undocumented station changes, which were not addressed in HCN version 1.
Abstract
We describe a fourth version of the Global Historical Climatology Network (GHCN)-monthly (GHCNm) temperature dataset. Version 4 (v4) fulfills the goal of aligning GHCNm temperature values with the GHCN-daily dataset and makes use of data from previous versions of GHCNm as well as data collated under the auspices of the International Surface Temperature Initiative. GHCNm v4 has many thousands of additional stations compared to version 3 (v3) both historically and with short time-delay updates. The greater number of stations as well as the use of records with incomplete data during the base period provides for greater global coverage throughout the record compared to earlier versions. Like v3, the monthly averages are screened for random errors and homogenized to address systematic errors. New to v4, uncertainties are calculated for each station series, and regional uncertainties scale directly from the station uncertainties. Correlated errors in the station series are quantified by running the homogenization algorithm as an ensemble. Additional uncertainties associated with incomplete homogenization and use of anomalies are then incorporated into the station ensemble. Further uncertainties are quantified at the regional level, the most important of which is for incomplete spatial coverage. Overall, homogenization has a smaller impact on the v4 global trend compared to v3, though adjustments lead to much greater consistency than between the unadjusted versions. The adjusted v3 global mean therefore falls within the range of uncertainty for v4 adjusted data. Likewise, annual anomaly uncertainties for the other major independent land surface air temperature datasets overlap with GHCNm v4 uncertainties.
Abstract
NOAA global surface temperature (NOAAGlobalTemp) is NOAA’s operational global surface temperature product, which has been widely used in Earth’s climate assessment and monitoring. To improve the spatial interpolation of monthly land surface air temperatures (LSATs) in NOAAGlobalTemp from 1850 to 2020, a three-layer artificial neural network (ANN) system was designed. The ANN system was trained by repeatedly selecting a random 90% of the LSATs from ERA5 (1950–2019) and validating with the remaining 10%. Validations show clear improvements of ANN over the original empirical orthogonal teleconnection (EOT) method: the global spatial correlation coefficient (SCC) increases from 65% to 80%, and the global root-mean-square difference (RMSD) decreases from 0.99° to 0.57°C during 1850–2020. The improvements in SCCs and RMSDs are larger in the Southern Hemisphere than in the Northern Hemisphere and are larger before the 1950s and where observations are sparse. Finally, the ANN system was fed observed LSATs, and its output over the global land surface was compared with that from the EOT method. Comparisons demonstrate similar improvements by ANN over the EOT method: the global SCC increases from 78% to 89%, the global RMSD decreases from 0.93° to 0.68°C, and the LSAT variability quantified by the monthly standard deviation (STD) increases from 1.16° to 1.41°C during 1850–2020. While the SCC, RMSD, and STD at the monthly time scale have been improved, long-term trends remain largely unchanged because the low-frequency component of LSAT in ANN is identical to that in the EOT approach.
Significance Statement
Spatial interpolation with an artificial neural network greatly improves the accuracy of land surface air temperature reconstruction, reducing root-mean-square error and increasing spatial coherence and variability over the global land surface from 1850 to 2020.
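The validation bookkeeping described in the abstract can be sketched in a few lines of Python. The abstract does not define its metrics precisely, so this sketch assumes that the SCC is a Pearson correlation across grid points and that the RMSD is an unweighted root-mean-square difference (no area weighting). The hold-out helper shows only one random 90%/10% split, which the paper repeats. The function names are hypothetical, and the neural network itself is not shown.

```python
import math
import random


def holdout_split(n_items, train_fraction=0.9, seed=0):
    """Randomly partition item indices into training and validation sets."""
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    cut = int(round(train_fraction * n_items))
    return indices[:cut], indices[cut:]


def rmsd(a, b):
    """Unweighted root-mean-square difference between two fields."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def spatial_correlation(a, b):
    """Pearson correlation across grid points (assumed definition of SCC)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    train, validate = holdout_split(100)
    print(len(train), len(validate))
    print(rmsd([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(spatial_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```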
Abstract
Temperature time series for stations in western North Carolina are used to evaluate the potential for an urban signal in the local temperature trend, and to compare a homogeneous temperature record from a mountain-top station to two versions of the lower-tropospheric, satellite-derived temperatures from the Microwave Sounding Unit (MSU). Results regarding the urban signal are in agreement with the conclusion from previous investigations that after a location is urbanized, the local temperature trend is consistent with trends derived from surrounding, more rural stations. With respect to the mountain top and lower-tropospheric temperature comparison, the magnitudes of the two MSU-derived trends for the western North Carolina area are closer to the average annual minimum temperature trend than to the annual average maximum temperature trend.