The National Climatic Data Center is committed to archiving and disseminating data of high quality. Automated screening has proven very effective in isolating suspect and erroneous values in large meteorological data sets. However, manual review by validators is required to judge the validity of, and correct, the data rejected by the screens. Since the judgment of the validators affects the quality of the data, the efficacy of their actions is of paramount importance.
Techniques have been developed to measure whether data validators make the proper decisions when editing data. Measurement is accomplished by replacing valid data with known errors (so-called “seeds”) and then monitoring the validators' decisions. Procedural details and examples are given.
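The seeding step described above can be sketched in a few lines. This is an illustrative implementation under assumed conditions (numeric records, gross errors of a fixed magnitude), not NCDC's actual procedure; the function and parameter names are hypothetical:

```python
import random

def seed_errors(records, n_seeds, magnitude=50.0, rng=None):
    """Replace n_seeds valid values with known errors ("seeds").

    Returns the seeded records plus a log of (index, original, seeded)
    so each validator decision on a seed can be checked afterward.
    The error model (add/subtract a fixed gross offset) is an assumption
    for illustration only.
    """
    rng = rng or random.Random(0)  # fixed default seed for reproducibility
    seeded = list(records)
    log = []
    for i in rng.sample(range(len(records)), n_seeds):
        original = seeded[i]
        # Insert a known gross error at a recorded location.
        seeded[i] = original + rng.choice([-1.0, 1.0]) * magnitude
        log.append((i, original, seeded[i]))
    return seeded, log
```

Because the seed locations and original values are logged, every validator action on a seed can later be compared against the known truth.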
The measurement program has several benefits: (1) validator performance is quantitatively evaluated; (2) limited inferences about data quality can be made; (3) feedback to the validators identifies training requirements and operational procedures that could be improved; and (4) errors of omission as well as of commission are found. It is important to recognize that seeding does not detect errors inserted into the data by validators. Thus, seeding is but one aspect of a comprehensive surveillance mechanism.
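One way to operationalize benefits (1) and (4) is to classify each decision a validator makes on a seeded value. The three-way scheme below (correct restoration, omission, commission) is a plausible scoring rule consistent with the abstract, not the paper's documented procedure; names and the tolerance parameter are hypothetical:

```python
def score_decisions(log, edited, tol=0.1):
    """Classify the validator's action on each seeded value.

    log    : list of (index, original, seeded_value) from the seeding step
    edited : the data after validator review
    tol    : tolerance for treating two values as equal (assumed)

    Categories (illustrative):
      correct    - value restored to within tol of the original
      omission   - seed left at the erroneous value (error not caught)
      commission - value changed, but not back to the original
    """
    counts = {"correct": 0, "omission": 0, "commission": 0}
    for i, original, seeded_value in log:
        value = edited[i]
        if abs(value - original) <= tol:
            counts["correct"] += 1
        elif abs(value - seeded_value) <= tol:
            counts["omission"] += 1
        else:
            counts["commission"] += 1
    return counts
```

Aggregating these counts per validator yields the quantitative performance evaluation the program is designed for; note that, as stated above, this scoring sees only the seeds and cannot detect errors a validator introduces into unseeded data.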