A Deterministic Approach to the Validation of Historical Daily Temperature and Precipitation Data from the Cooperative Network

Thomas Reek, National Oceanic and Atmospheric Administration, National Climatic Data Center, Asheville, North Carolina

Stephen R. Doty, National Oceanic and Atmospheric Administration, National Climatic Data Center, Asheville, North Carolina

Timothy W. Owen, National Oceanic and Atmospheric Administration, National Climatic Data Center, Asheville, North Carolina

It is widely known that the TD3200 (Summary of the Day Cooperative Network) database held by the National Climatic Data Center contains tens of thousands of erroneous daily values resulting from data-entry, data-recording, and data-reformatting errors. TD3200 serves as a major baseline dataset for detecting global climate change, so it is of paramount importance to the climate community that these data be as error-free as possible. Many of these errors are systematic in nature; with a deterministic approach based on empirically developed criteria, many if not most of them can be corrected or removed. A computer program employing a Backus Normal Form (Backus–Naur Form) structure design and a series of chain-linked tests encoded as rules has been developed to model the subjective human process of inductive data review. This objective, automated correction process has proven highly effective: a manual review and validation of 138 stations from a 1300-station subset of TD3200 closely matched its results. The technique is expected to be applied in producing a nearly error-free TD3200 dataset.
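The abstract describes validation as a chain of rule-encoded tests applied in sequence. The following is a minimal sketch of that idea, not NCDC's actual program: the rule names, thresholds, and record layout here are illustrative assumptions, and the real TD3200 procedure encodes far more (and empirically derived) criteria.

```python
def range_check(rec):
    """Flag daily extremes outside illustrative plausibility bounds (deg F)."""
    if not -70 <= rec["tmax"] <= 135:
        return "TMAX_OUT_OF_RANGE"
    if not -80 <= rec["tmin"] <= 110:
        return "TMIN_OUT_OF_RANGE"
    return None

def internal_consistency(rec):
    """The daily maximum must not fall below the daily minimum."""
    if rec["tmax"] < rec["tmin"]:
        return "TMAX_LT_TMIN"
    return None

# The tests are chain-linked: each rule runs only if every earlier rule
# passed, so a single gross error does not trigger a cascade of
# secondary flags on the same record.
RULES = [range_check, internal_consistency]

def validate(rec, rules=RULES):
    """Return the first flag raised by the chain, or None if the record passes."""
    for rule in rules:
        flag = rule(rec)
        if flag is not None:
            return flag
    return None
```

For example, `validate({"tmax": 52, "tmin": 61})` returns `"TMAX_LT_TMIN"`, a likely transposition error of the kind a deterministic pass can detect, while a clean record such as `{"tmax": 70, "tmin": 50}` passes with `None`.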
