Search Results
Showing 1–5 of 5 items for:
- Author or Editor: Russell S. Vose
- Journal of Applied Meteorology and Climatology
Abstract
The evaluation strategies outlined in this paper constitute a set of tools beneficial to the development and documentation of robust automated quality assurance (QA) procedures. Traditionally, thresholds for the QA of climate data have been based on target flag rates or statistical confidence limits. However, these approaches do not necessarily quantify a procedure’s effectiveness at detecting true errors in the data. Rather, as illustrated by way of an “extremes check” for daily precipitation totals, information on the performance of a QA test is best obtained through a systematic manual inspection of samples of flagged values combined with a careful analysis of geographical and seasonal patterns of flagged observations. Such an evaluation process not only helps to document the effectiveness of each individual test, but, when applied repeatedly throughout the development process, it also aids in choosing the optimal combination of QA procedures and associated thresholds. In addition, the approach described here constitutes a mechanism for reassessing system performance whenever revisions are made following initial development.
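To make this concrete, here is a minimal Python sketch of the workflow the abstract advocates: an illustrative "extremes check" that flags daily precipitation totals exceeding a multiple of the station's 95th wet-day percentile, followed by random sampling of the flags for manual review. The threshold rule, function names, and sample size are hypothetical illustrations, not the procedure documented in the paper.

```python
import random
from statistics import quantiles

def extremes_check(daily_precip_mm, threshold_factor=3.0):
    """Flag daily precipitation totals exceeding a multiple of the
    station's 95th wet-day percentile (hypothetical rule, for illustration)."""
    wet_days = [p for p in daily_precip_mm if p > 0]
    if len(wet_days) < 20:  # too few wet days to set a stable threshold
        return []
    p95 = quantiles(wet_days, n=20)[-1]  # 95th percentile of wet-day totals
    limit = threshold_factor * p95
    return [i for i, p in enumerate(daily_precip_mm) if p > limit]

def sample_for_review(flagged_indices, n=30, seed=0):
    """Draw a reproducible random sample of flags for manual inspection."""
    random.seed(seed)
    return random.sample(flagged_indices, min(n, len(flagged_indices)))
```

The fraction of sampled flags judged valid on inspection estimates the check's false-positive rate, and it is that estimate, rather than a target flag rate alone, that would guide the choice of threshold_factor.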
Abstract
This paper describes the fully automated quality-assurance (QA) procedures that are being applied to temperatures in the Integrated Global Radiosonde Archive (IGRA). Because these data are routinely used for monitoring variations in tropospheric temperature, it is of critical importance that the system be able to detect as many errors as possible without falsely identifying true meteorological events as erroneous. Three steps were taken to achieve such robust performance. First, 14 tests for excessive persistence, climatological outliers, and vertical and temporal inconsistencies were developed and arranged into a deliberate sequence so as to render the system capable of detecting a variety of data errors. Second, manual review of random samples of flagged values was used to set the “thresholds” for each individual check so as to minimize the number of valid values that are mistakenly identified as errors. Third, the performance of the system as a whole was assessed through manual inspection of random samples of the quality-assured data. As a result of these efforts, the IGRA temperature QA procedures effectively remove the grossest errors while maintaining a false-positive rate of approximately 10%.
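The two design ideas emphasized here, a deliberate ordering of checks and thresholds tuned by manual review, can be sketched in Python as follows. The check interface and the review bookkeeping are assumptions made for illustration; the 14 IGRA tests themselves are specified in the paper.

```python
def run_qa_sequence(values, checks):
    """Apply QA checks in a fixed order; values flagged by an earlier
    check are withheld from later checks, so later tests see cleaner input."""
    flagged = set()
    for check in checks:
        candidates = {i: v for i, v in enumerate(values) if i not in flagged}
        flagged |= check(candidates)  # each check returns a set of indices
    return flagged

def false_positive_rate(reviewed):
    """reviewed: list of (index, judged_valid) pairs from manual review
    of a random sample of flags. The fraction judged valid estimates the
    check's false-positive rate."""
    if not reviewed:
        return 0.0
    return sum(judged_valid for _, judged_valid in reviewed) / len(reviewed)
```

Tightening a check's threshold until this estimated rate is acceptably low is one simple way to realize the "minimize mistakenly flagged valid values" criterion described above.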
Abstract
This paper describes a comprehensive set of fully automated quality assurance (QA) procedures for observations of daily surface temperature, precipitation, snowfall, and snow depth. The QA procedures are being applied operationally to the Global Historical Climatology Network (GHCN)-Daily dataset. Because these data are used for analyzing and monitoring variations in extremes, the QA system is designed to detect as many errors as possible while maintaining a low probability of falsely identifying true meteorological events as erroneous. The system consists of 19 carefully evaluated tests that detect duplicate data, climatological outliers, and various inconsistencies (internal, temporal, and spatial). Manual review of random samples of the values flagged as errors is used to set the threshold for each procedure such that its false-positive rate, or fraction of valid values identified as errors, is minimized. In addition, the tests are arranged in a deliberate sequence in which the performance of the later checks is enhanced by the error detection capabilities of the earlier tests. Based on an assessment of each individual check and a final evaluation for each element, the system identifies 3.6 million (0.24%) of the more than 1.5 billion maximum/minimum temperature, precipitation, snowfall, and snow depth values in GHCN-Daily as errors, has a false-positive rate of 1%–2%, and is effective at detecting both the grossest errors and more subtle inconsistencies among elements.
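As an illustrative stand-in for the climatological-outlier class of tests, the sketch below flags daily maximum temperatures that fall far from the station's calendar-month mean. The z-score formulation and the z_limit value are assumptions for illustration; the 19 actual GHCN-Daily checks are specified in the paper.

```python
from collections import defaultdict
from statistics import mean, stdev

def climatological_outliers(dates, tmax, z_limit=6.0):
    """Flag daily maximum temperatures more than z_limit standard
    deviations from the station's calendar-month mean (illustrative
    stand-in for a climatological-outlier check).
    dates: sequence of datetime.date; tmax: matching temperatures."""
    by_month = defaultdict(list)
    for d, t in zip(dates, tmax):
        by_month[d.month].append(t)
    stats = {m: (mean(v), stdev(v)) for m, v in by_month.items() if len(v) > 1}
    flags = []
    for i, (d, t) in enumerate(zip(dates, tmax)):
        if d.month in stats:
            mu, sd = stats[d.month]
            if sd > 0 and abs(t - mu) / sd > z_limit:
                flags.append(i)
    return flags
```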
Abstract
The 1981–2010 “U.S. Climate Normals” released by the National Oceanic and Atmospheric Administration’s (NOAA) National Climatic Data Center include a suite of monthly, seasonal, and annual statistics that are based on precipitation, snowfall, and snow-depth measurements. This paper describes the procedures used to calculate the average totals, frequencies of occurrence, and percentiles that constitute these normals. All parameters were calculated from a single, state-of-the-art dataset of daily observations, taking care to produce normals that were as representative as possible of the full 1981–2010 period, even when the underlying data records were incomplete. In the resulting product, average precipitation totals are available at approximately 9300 stations across the United States and parts of the Caribbean Sea and Pacific Ocean islands. Snowfall and snow-depth statistics are provided for approximately 5300 of those stations, as compared with several hundred stations in the 1971–2000 normals. The 1981–2010 statistics exhibit the familiar climatological patterns across the contiguous United States. When compared with the same calculations for 1971–2000, the later period is characterized by a smaller number of days with snow on the ground and less total annual snowfall across much of the contiguous United States; wetter conditions over much of the Great Plains, Midwest, and northern California; and drier conditions over much of the Southeast and Pacific Northwest. These differences reflect the removal of the 1970s and the addition of the 2000s to the 30-yr-normals period as part of this latest revision of the normals.
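As a hedged sketch of how an average monthly total might be computed from incomplete daily records, the function below accumulates only month-years that meet a completeness criterion and averages across them. The 90% completeness rule and the data layout are illustrative assumptions, not NOAA's documented methodology.

```python
from collections import defaultdict

def monthly_precip_normal(daily, month, min_complete=0.9):
    """Average total precipitation for one calendar month over 1981-2010.
    daily: dict mapping datetime.date -> precipitation in mm (None if
    missing). Only month-years whose fraction of non-missing days meets
    min_complete contribute (illustrative rule, not NOAA's procedure)."""
    by_year = defaultdict(list)
    for d, p in daily.items():
        if d.month == month and 1981 <= d.year <= 2010:
            by_year[d.year].append(p)
    totals = []
    for values in by_year.values():
        observed = [p for p in values if p is not None]
        if values and len(observed) >= min_complete * len(values):
            totals.append(sum(observed))
    return sum(totals) / len(totals) if totals else None
```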
Abstract
This paper describes an improved edition (version 2) of the climate division dataset for the conterminous United States. The first improvement is to the input data, which now include additional station networks, quality assurance reviews, and temperature bias adjustments. The second improvement is to the suite of climatic elements, which now includes both maximum and minimum temperatures. The third improvement is to the computational approach, which now employs climatologically aided interpolation to address topographic and network variability. Version 2 exhibits substantial differences from version 1 over the period 1895–2012. For example, divisional averages in version 2 tend to be cooler and wetter, particularly in mountainous areas of the western United States. Division-level trends in temperature and precipitation display greater spatial consistency in version 2. National-scale temperature trends in version 2 are comparable to those in the U.S. Historical Climatology Network, whereas version 1 exhibits less warming as a result of historical changes in observing practices. Divisional errors in version 2 are likely less than 0.5°C for temperature and 20 mm for precipitation at the start of the record, falling rapidly thereafter. Overall, these results indicate that version 2 can supersede version 1 in both operational climate monitoring and applied climatic research.
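Climatologically aided interpolation can be summarized as interpolating station anomalies, that is, departures from a long-term climatology, rather than absolute values, and then restoring the climatology at the target point. The inverse-distance-squared weighting below is an assumed, simplified scheme, not the operational algorithm.

```python
def caa_interpolate(stations, target, target_climatology):
    """Climatologically aided interpolation, schematically:
    interpolate station anomalies (obs minus station climatology) with
    inverse-distance-squared weights, then add back the climatology at
    the target point. stations: iterable of (lon, lat, obs, station_clim);
    target: (lon, lat). Weighting scheme is an illustrative assumption."""
    num = den = 0.0
    for lon, lat, obs, clim in stations:
        d2 = (lon - target[0]) ** 2 + (lat - target[1]) ** 2
        w = 1.0 / max(d2, 1e-9)  # avoid division by zero at the station
        num += w * (obs - clim)  # interpolate the anomaly, not the value
        den += w
    return target_climatology + num / den
```

Because the climatology carries most of the topographic structure, the interpolation only needs to capture the smoother anomaly field; this is the usual rationale for using the approach in complex terrain.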