## 1. Introduction

Air temperature is an important factor influencing human activity. Accurate, continuous real-time land surface temperature observations help people recognize trends related to environmental change. With sufficient reliable data, one can assess changes in temperature distribution maps and carry out further analysis. False information could contaminate the analysis results. Erroneous records, however, do occasionally appear; they result from mistakes made during instrument operation, instrument inaccuracies, data transmission failures, or interference from human activity. It is thus desirable for the data management center to build a quality control (QC) system to ensure that incoming data are accurate before any further analysis is undertaken.

Temperature QC issues have been discussed in many articles. For short-term and/or instantaneous observation checks, three processes are typically emphasized: the range check, the spatial check, and the temporal check (e.g., Graybeal et al. 2004; Feng et al. 2004; Fiebrich et al. 2010; Lussana et al. 2010). The approaches used for implementation, however, differ from system to system because they are regionally adapted and proposed based on different statistical assumptions. The purpose of this study is to introduce the real-time land surface temperature QC program developed by Taiwan’s Central Weather Bureau (CWB). The program is unique in that it presents new thoughts on developing models for data verification, such as employing the theorems of unbiased and minimum error estimators for implementing the spatial check. The proposed verification processes and the strategies used to check data follow practices accepted across the research community.

The territory investigated in this paper is Taiwan, an island in eastern Asia lying on the Tropic of Cancer, as shown in Fig. 1. The island is 143 km at its widest point and 385 km long. The total land area is around 36 193 km^{2}, of which more than two-thirds is mountainous, with generally steep slopes. Currently, 347 real-time meteorological stations are being accessed; 31 of them are CWB manually operated stations, denoted by the triangular symbols in Fig. 1. The data records for the CWB manually operated stations can be more than 40 years in length. The 280 CWB automated stations, denoted by the hollow circular symbols, have data records from 1 to 27 years in length; the CWB continues to expand its network of automated stations in the south yearly. The remaining stations, denoted by the cross symbols, belong to other authorities (for convenience, they are referred to as the cooperating stations) and all have data records longer than 10 years. The stations’ elevations range from 2.0 to 3844.8 m MSL. Most stations are situated on plains and hills; only a few are on high mountains and the surrounding islands because of communication difficulties. Among the stations, the CWB manually operated stations offer the most reliable data, while the cooperating stations are generally less reliable.

There are five sections in this paper. Section 1 provides the introduction and context. Section 2 details the algorithms and data verification strategies. Section 3 shows the experimental results for the data from 2014. Section 4 presents the performance evaluation of the spatial check scheme. The final section offers conclusions and suggestions.

## 2. Data verification strategies

There are three strategies for checking data in the program; they are the range check, the spatial check, and the temporal check. The design of the data verification procedure is shown in Fig. 2. The temperature observation must first pass the range check, followed by the spatial or temporal checks. The temporal check is applied only to the stations that lack sufficient references and hence are unable to carry out the spatial check. The implementation procedures of the strategies are as follows.

### a. Range check

The range check is the primary test; it inspects whether the observation falls inside the climatological limits of the station to screen out obviously incorrect measurements. The limits are determined according to the historical records. Some systems use the world’s most extreme temperature records, −89.4° and 57.8°C, as the limits (e.g., Feng et al. 2004; Durre et al. 2010). The historical minimum and maximum temperatures observed in Taiwan are −18.4° and 40.5°C, respectively; the world’s extreme temperature range is thus too wide to reference. Hubbard et al. (2005) advise using the monthly mean ±3 times the standard deviation as the upper and lower boundaries for daily temperature verification, with the ranges adjusted month by month. Meek and Hatfield (1994) and Fiebrich et al. (2010) propose fitting a sinusoidal variation to the yearly maximum and minimum temperatures to derive daily extreme limits. Unfortunately, none of the range check methods mentioned above is suitable for Taiwan, because in addition to solar radiation, other factors can lead to fluctuations of the hourly temperature. Such factors include mei-yu (precipitation along a persistent stationary front that can last from several days to weeks, typically occurring in May and June), typhoons, frontal passages, thunderstorms, foehn winds, and so on. Hence, less rigid, variable limits ought to be considered.
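As a sketch of how such period-dependent limits can be built, the fragment below derives mean ± *k*σ boundaries from historical samples for one station and period, in the spirit of the ±3σ rule of Hubbard et al. (2005); the function names and the adjustable `k` are our illustrative assumptions, not the CWB program’s actual implementation.

```python
import numpy as np

def range_limits(samples, k=3.0):
    """Climatological limits for one station/period: mean +/- k std deviations.

    `samples` are historical hourly temperatures (degC) for the period.
    k is left adjustable so the limits can be loosened for weather-driven
    fluctuations (fronts, typhoons, foehn winds, ...) -- an assumption here.
    """
    arr = np.asarray(samples, dtype=float)
    mu, sigma = arr.mean(), arr.std(ddof=1)
    return mu - k * sigma, mu + k * sigma

def passes_range_check(obs, lo, hi):
    """An observation passes if it falls inside the climatological limits."""
    return lo <= obs <= hi
```

The limits would be recomputed per station, per period, so that a tropical lowland station and a high-mountain station each get boundaries appropriate to their own climate.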

Table 1. Average lapse rates and limits of sea level temperature in different periods.

### b. Spatial check

The spatial check is a process that examines the deviation between the observation of the target station and the estimate derived from the observations of its neighboring reference stations to judge the reliability of the data (Karataş and Yalçin 2005). The spatial check scheme introduced in this study is distinctive in that no other subjective assumption is applied: under the theorems of unbiased and minimum error estimators, the reference stations determine their own weights according to the correlations among them.

The steps used for analysis are as follows.

#### 1) Selecting reference stations

The reference stations are the neighboring stations whose observations are used to evaluate the target station. The selection of the reference stations is generally based on distance, sometimes with particular restrictions attached. For example, in the spatial regression test, Hubbard et al. (2005) suggest choosing the five best linearly fit neighbors as the references for the daily maximum/minimum air temperature verification, with candidates selected from the surrounding stations within a radius of 50–150 km, depending on the station density. Hubbard and You (2005) assert that at least 10 stations are needed to derive a stable weighted estimate, with candidates selected from the surrounding stations within a radius of 50 km and with an *r*^{2} value larger than 0.5. Unlike spatial interpolation, where the optimum estimate is demanded, for the spatial check a few good estimates are sufficient to approach a satisfactory weighted estimate. Considering that time control is an important issue in real-time data management and all jobs must be done in a short period of time, we suggest using five references. Should one or two of the referenced observations be missing, it is still possible to use the remaining references for the spatial check. The spatial check can be regarded as a form of voting, so at least three references must be available; otherwise, a judgment cannot be made.

This study employs the Pearson product-moment correlation coefficient to select the references; *r*^{2} must be no less than 0.5 (i.e., more than 50% of the total variation of the observations of the target station is accounted for by a linear relationship with the observations made from the reference station).
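A minimal sketch of the selection rule described above (candidates within a distance cutoff, *r*^{2} ≥ 0.5, nearest five kept) might look as follows; the data layout, the default 50-km cutoff, and the function names are illustrative assumptions rather than the CWB’s actual data model.

```python
import numpy as np

def select_references(target_xy, target_series, candidates, n_refs=5,
                      max_dist_km=50.0, r2_min=0.5):
    """Pick up to `n_refs` reference stations for the spatial check.

    `candidates` maps station id -> ((x, y) in km, temperature series).
    Stations within `max_dist_km` whose squared Pearson correlation with
    the target is at least `r2_min` are ranked by distance; the nearest
    n_refs are kept.
    """
    target_series = np.asarray(target_series, dtype=float)
    picked = []
    for sid, (xy, series) in candidates.items():
        d = float(np.hypot(xy[0] - target_xy[0], xy[1] - target_xy[1]))
        if d > max_dist_km:
            continue
        r = np.corrcoef(target_series, np.asarray(series, dtype=float))[0, 1]
        if r * r >= r2_min:
            picked.append((d, sid))
    picked.sort()
    return [sid for _, sid in picked[:n_refs]]
```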

Note that two nearby stations may not necessarily have similar temperature time series. Figure 5 shows an example. The two subgraphs are the hourly average temperature series of stations D2F230 and C0F861 in periods I and IV, respectively, derived from the observations from the 2010–12 period. Station D2F230 is located in a valley at an elevation of 1734 m, and station C0F861 is on a hill at an elevation of 2215 m MSL. The distance between the two stations is around 17 km. The temperature observations from station D2F230 are typically lower than those from station C0F861 in the early morning hours, while the situation reverses later in the day; the gradient changes over different periods of the day. Table 2 shows the correlation coefficients between stations C0F861 and D2F230 in different periods. The mean correlation coefficients are all above 0.8, but when the correlation coefficients are calculated hour by hour, some of them fall below 0.707 (i.e., *r*^{2} < 0.5), as shown in Table 2.

Table 2. Correlation coefficients between stations D2F230 and C0F861 in different periods. Numbers larger than 0.707 are given in bold.

#### 2) Evaluating reference estimates

Each reference estimate is derived from the linear regression of the observations of the target station on those of the reference station [Eq. (3)].

#### 3) Evaluating weighted estimate

The weighted estimate is the linear combination of the reference estimates [Eq. (4)], $\hat{T}_0 = \sum_{j=1}^{n} w_j \hat{T}_j$, where $w_j$ and $\hat{T}_j$ are the weight and the estimate of reference station $j$, respectively, and $n$ is the number of references. Under the unbiased constraint $\sum_{j} w_j = 1$, the weights that minimize the expected error variance are solved from the simultaneous equations [Eq. (6)] built from the covariances between reference stations $k$ and $j$; the derivation is given in appendix A. The expected error variance of the weighted estimate [Eq. (8)] is then used to construct the confidence interval for the examination.

The parameters are assessed according to recent historical observations. To get statistically valid results, at least 90 samples are needed for the regressions (and for a stable regression result, the data should be widely distributed). A period contains 2 or 3 months, that is, about 60 or 90 days; satisfying the 90-sample requirement therefore means including the data from all days in a period. The CWB continues to construct new stations. For newly constructed stations that started operating in the middle of the previous year, 2 years of data are required to fulfill the above-stated condition. Considering that new stations may also need a period of operation before providing stable observations (missing data may occur occasionally at the beginning stages of a station’s operation), it is preferable to use 3 years of data.

The parameters need to be updated yearly to include the information from the newly constructed stations.
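The unbiased, minimum-error-variance weights can be computed by solving the covariance system bordered by the unbiasedness constraint, as derived in appendix A. The sketch below is a generic implementation of that construction with our own function names; the covariance inputs would come from the historical regressions.

```python
import numpy as np

def minimum_error_weights(C, c0):
    """Unbiased, minimum-error-variance weights for the reference estimates.

    C  : n x n covariance matrix of the reference estimates.
    c0 : length-n covariances between the target and each reference estimate.
    The constraint sum(w) = 1 (unbiasedness) is enforced with a Lagrange
    multiplier, giving a bordered linear system. This is a generic sketch
    of the appendix A construction, not the CWB code.
    """
    n = len(c0)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = C
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    b = np.append(np.asarray(c0, dtype=float), 1.0)
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n]  # weights, Lagrange multiplier

def error_variance(var0, C, c0, w):
    """Expected error variance of the weighted estimate:
    Var(T0) - 2 w.c0 + w' C w."""
    w = np.asarray(w, dtype=float)
    return float(var0 - 2.0 * w @ np.asarray(c0, dtype=float) + w @ C @ w)
```

When the references are mutually uncorrelated and equally informative, the system reduces to equal weights; correlated references are automatically down-weighted, which is the behavior the independence-assuming spatial regression test lacks.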

#### 4) Executing examination
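As a sketch of the examination step, the observation is compared against a confidence interval centered on the weighted estimate, with a half-width proportional to the square root of the expected error variance. The multiplier `z` below is an assumed stand-in, not the CWB’s exact value; z ≈ 5.56 approximately reproduces the 99.95% interval quoted for station C0AD00 in section 3 (estimate 13.74°C, variance 0.598, interval 9.44°–18.04°C).

```python
import math

def spatial_check(obs, weighted_estimate, err_variance, z=5.56):
    """Flag the observation if it falls outside the confidence interval
    built from the weighted estimate and the expected error variance.

    Returns (passed, lower_bound, upper_bound). The half-width
    z * sqrt(variance) is an illustrative parameterization.
    """
    half = z * math.sqrt(err_variance)
    lo, hi = weighted_estimate - half, weighted_estimate + half
    return lo <= obs <= hi, lo, hi
```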

### c. Temporal check

The spatial check is not always applicable. Stations located in the high mountains or on the outlying islands may be too far from other stations to have sufficient reference support. For stations that were recently constructed, there are not sufficient historical data from which to generate the spatial check parameters. In these cases, the temporal check is used to evaluate the observations.

The temporal check examines whether the temperature difference between two successive observations (the current and previous records) from a specific station exceeds the temperature variation threshold, to judge the rationality of the data. Meek and Hatfield (1994) suggest using a constant value, 6°C, as the hourly temporal check threshold. As pointed out by Graybeal et al. (2004), however, a constant threshold may misidentify the jumps brought about by frontal passages as errors. It is more reasonable to set variable thresholds according to the weather conditions.

Temperature variations usually, though not always, have identifiable causes. Rapid increases in temperature occur typically in the morning. Sudden drops are often the result of rainfall or a frontal passage, and the temperature may rebound when the rain stops. In most cases, the variations are regional, and the nearby stations may record the same trend. Dramatic rises or drops are therefore not necessarily erroneous, especially when observed in open areas. Accordingly, this paper does not use the commonly seen target-flagging rate skill (Graybeal et al. 2004) to determine the temperature difference thresholds, but instead focuses on finding the ceiling and floor boundaries.

At least two neighboring stations are needed to do the comparison.
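The temporal check described above can be sketched in two stages. The exact forms of Eqs. (12) and (13) are not reproduced in this excerpt, so the neighbor “step consistency” test below is an illustrative stand-in (including its 2.0°C trigger): it flags a sizable jump whose sign disagrees with every available neighboring station’s change, as in the station C0U650 case discussed in section 3.

```python
def temporal_check(prev_obs, curr_obs, neighbor_deltas, threshold=6.0):
    """Two-stage temporal check (a sketch, not the CWB's Eqs. (12)-(13)).

    Stage 1: the successive difference must not exceed `threshold`
             (6 degC is the hourly figure quoted in section 2c / Fig. 8).
    Stage 2 (hypothetical 'step consistency' form): a jump larger than
             2.0 degC whose sign disagrees with every neighboring
             station's change is flagged even under the threshold.
             At least two neighbor differences are required.
    """
    delta = curr_obs - prev_obs
    if abs(delta) > threshold:
        return False
    if len(neighbor_deltas) >= 2 and abs(delta) > 2.0:
        if all(d * delta < 0 for d in neighbor_deltas):
            return False
    return True
```

With the numbers from section 3, a 17.6° to 24.2°C step fails stage 1, and a +4.2°C rise against neighbor drops of about −1°C fails stage 2.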

## 3. Results of the experiment

The data verification procedure is as shown in Fig. 2. The observation must first pass the range check and then go through the spatial or the temporal check. The temporal check is applied only to the stations that are unable to do the spatial check. If the previous observation is absent, the interpolated value estimated by the universal kriging technique (Maidment 1992) is used instead for the comparison.

Applying the parameters generated from the historical data to examine the hourly temperature observations, the ratios of the data flagged by the range, spatial, and temporal checks to the total observations are 0.0079%, 0.0304%, and 0.0011%, respectively. The error rates of the individual stations are shown in Fig. 10. Post-examinations are applied to the data from 2014 that were flagged as errors. Comparing the time series of the target station with those of its neighboring stations during the time interval of −48 to +48 h (centered on the flagged time) allows the rationality of the three data checking schemes to be reviewed. The results show that the QC program proposed in this paper is able to filter out most errors. Figures 11–15 present some examples. Figure 11 is an example of the range check failures from station 467650; Figs. 12 and 13 are examples of the spatial check failures from stations C0AD00 and C0U650; and Figs. 14 and 15 are examples of the temporal check failures from stations 467350 and C0R600. The background information of the stations is listed in Table 3. The stations with the prefix “46” are mostly CWB manually operated stations, while the stations with the prefix “C0” are CWB automated stations. The other stations are managed by the cooperating authorities.

Figure 11 shows an example of a range check failure for station 467650 (a CWB manually operated station). The temperature dropped abruptly from 20.5° to 0.6°C during 0000–0100 LT 18 May 2014. The low values (denoted by the cross symbol in red) lasted for 60 h until 1300 LT 20 May 2014, when the temperature jumped back to 19.0°C. The lower range check limit for station 467650 in May is 1.6°C; the low value thus failed to pass the range check. From the time series comparison of station 467650 to its neighboring station C0H950 in the period from 0000 LT 17 May to 0000 LT 21 May 2014, it shows that the temperature variations of station C0H950 were relatively mild and smooth. This information may indicate that the low values were incorrect. The errors may have resulted from an instrument or transmission failure.

Figure 12 shows an example of a spatial check failure for station C0AD00. The data flagged as errors were those observed from 2000 to 2300 LT 18 February, as denoted by the red crosses in Fig. 12. The data from 2000 LT are used to explain the case. According to the correlations, three reference stations were selected: C0AD10, C0A920, and C0A931. The correlation coefficients between the target and the reference stations at 2000 LT in period I are 0.962, 0.962, and 0.939, respectively. From the figure, it is observed that the temperature variations of these four stations were generally similar, except for the outlying data. At 2000 LT, the temperature observed at the target station C0AD00 was 19.1°C, and the temperatures at the reference stations C0AD10, C0A920, and C0A931 were 13.8°, 12.4°, and 12.5°C, respectively. The estimates derived from the corresponding linear regression equations [Eq. (3)] are 13.46°, 13.99°, and 14.62°C, and the weights solved from the simultaneous equations [Eq. (6)] are 0.562, 0.358, and 0.08, respectively. From Eqs. (4) and (8), the weighted estimate is 13.74°C, and the expected error variance is 0.598. The 99.95% confidence interval is 9.44°–18.04°C. As the observation from station C0AD00 (19.1°C) was outside the confidence interval, it was regarded as an error. The temperature rise at station C0AD00 from 1900 to 2000 LT was 6.3°C, which was larger than the temperature difference threshold (6.0°C; refer to Fig. 8); accordingly, had it been put into the temporal check, it would also have failed. There is no known condition that could lead to such a sudden rise in temperature during a winter night. This information indicates, with a level of confidence, that the outlying data were incorrect.

Figure 13 shows another example of a spatial check failure. From the time series comparison, it is observed that the data of station C0U650 in the period from 1800 LT 28 March to 0800 LT 29 March were obviously incorrect. A 4.2°C rise was observed at station C0U650 from 1700 to 1800 LT 28 March. Although the value was lower than the temperature difference threshold, its neighboring stations (C0U640, 467080, and C0U520) recorded drops, with temperature differences of −1.0°, −1.2°, and −1.0°C, respectively. If the data were put into the temporal check, then according to the conditions in Eqs. (12) and (13), the data from station C0U650 at 1800 LT would also fail the check.

Figures 14 and 15 are examples of temporal check failures. Most of the temporal check failures in 2014 were the result of the temperature difference going beyond the threshold. The example presented in Fig. 14 is representative of this type of failure. Station 467350 is an isolated station located on an outlying island; there is no suitable station nearby that can be used as a reference.

Figure 15 shows the time series comparison of station C0R600 with its neighboring stations. The observations of station C0R600 at 2000 and 2100 LT 31 March were identified as errors. The former was flagged because the temperature difference between the observations collected at 2000 and 1900 LT (24.2° − 17.6° = 6.6°C) was larger than the threshold (6.0°C; refer to Fig. 8); the latter was flagged because the difference between the observation collected at 2100 LT and the interpolated value for 2000 LT estimated by the universal kriging technique (24.2° − 17.2° = 7.0°C) also exceeded the threshold (the observation at 2000 LT having been rejected). Station C0R600 is relatively new; its construction was completed on 1 December 2013.

## 4. Spatial check scheme performance evaluation

Among the three data verification strategies, the primary scheme used is the spatial check. To evaluate the performance of the new spatial check scheme (SCS) proposed in this paper (for convenience, it is called the CWBSCS), this section presents comparisons of the CWBSCS with the spatial regression test (SRT), known as one of the most efficient spatial check schemes, introduced by Hubbard et al. (2005) and You et al. (2008). There are two concerns: the first is the ability for estimation and the second is the ability for error identification.

### a. Estimation comparisons

As seen in Eqs. (14) and (16), the SRT does not take into account the correlation between the reference stations; that is, the scheme is developed under the assumption that the references are independent of each other. It is desirable to study the role the correlation plays in influencing the estimation results. To ensure the experiment’s data requirements are fulfilled, fake data are used for testing. The data generation processes are described in appendix B. There are 41 stations involved: one station is the target and the other 40 are references distributed randomly within a range of 40 km from the target, as shown in Fig. 16. The stations’ variances are all assumed to equal 9.0, a value that approaches the mean of the variances of the real observations; the correlation length parameter

Figure 17 shows the variations of the average weighted estimate associated with the number of references (

Figures 18, 19, and 20 show the estimation results given by the samples with

Figure 21 shows the RMSE plot given by the real data observed in 2014, where the RMSEs are computed from all the data available without time and space separation. As expected, the CWBSCS performs the best with the least estimation errors; the curve becomes flat when

### b. Error identification comparisons

The error identification evaluation is implemented by seeding known errors into a clean dataset to see how many of the seeded errors are flagged, which allows the error identification ability of the QC scheme to be judged (Hubbard et al. 2005). To ensure the data are clean before seeding, the fake data are again used to carry out the test.

The five nearest of the 40 randomly distributed surrounding stations within the range of 40 km from the target are selected as the references. Initially, 10 000 sets of samples are used to determine the

The data check experiment is repeated 500 times. In each experiment, all data and parameters, including the station locations, are rearranged. A sample size of 270 sets is generated to calculate the parameters, and another 10 000 sets of samples are generated for examination. Of the 10 000 samples of the target, 1000 randomly selected samples are seeded with errors. Following the suggestion of Hubbard et al. (2005), the error is taken as the value of
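The seeding procedure can be sketched as below. The exact error magnitudes of Hubbard et al. (2005) are not reproduced in this excerpt, so the ±*k*σ error form, the default magnitude, and the flagging callback are illustrative assumptions.

```python
import numpy as np

def seed_and_score(clean, flag_fn, sigma, n_seed=1000, magnitude=3.0, seed=0):
    """Seeded-error evaluation (generic sketch, not the paper's exact recipe).

    Adds +/- magnitude * sigma to `n_seed` randomly chosen samples of the
    clean series, then reports the fraction of seeded errors that the
    user-supplied `flag_fn` flags as erroneous.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(clean, dtype=float).copy()
    idx = rng.choice(data.size, size=n_seed, replace=False)
    signs = rng.choice([-1.0, 1.0], size=n_seed)
    data[idx] += signs * magnitude * sigma
    flagged = sum(1 for i in idx if flag_fn(data[i]))
    return flagged / n_seed
```

Repeating this over many rearranged experiments, as described above, yields the flagging-rate curves used to compare the CWBSCS and the SRT.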

Experiments with the real data observed in 2014 from four manually operated stations, located in the northern, central, southern, and eastern regions of Taiwan (refer to Table 3), are also performed. The

## 5. Conclusions and suggestions

In this paper, a QC program for the real-time hourly temperature observation developed by the Central Weather Bureau in Taiwan is introduced. There are three data verification strategies involved, which are the range check, the spatial check, and the temporal check.

The range check scheme inspects whether the observation falls inside the climatological limits of the station to screen out obvious outliers. According to the climate conditions in Taiwan, the year is divided into five periods: December–February, March and April, May and June, July–September, and October and November. The temperature limits are derived for each period. Station elevation is also taken into account, and the limits are adjusted accordingly.

The spatial check scheme scrutinizes whether the observation falls inside the derived confidence interval, according to the data from the reference stations and the correlations among the stations, to judge the reliability of the data. The spatial check scheme introduced in this study is distinctive in that it employs the theorems of unbiased and minimum error estimators to determine the weights. Compared with the existing spatial regression test (You et al. 2008), which was developed under the assumption that the references are independent of each other, the new spatial check scheme proposed in this paper performs better in both estimation and error identification, for the fake data as well as the real data tested. It can be concluded that the correlations between the references do play a role in influencing the estimation results, and taking them into account makes the new spatial check scheme superior in theory to the spatial regression test. As the correlations between the stations are time dependent, the parameters are derived period by period and also hour by hour.

The temporal check scheme examines whether the temperature difference between two successive observations from an individual station exceeds the temperature variation threshold, to judge the rationality of the data. Different thresholds are applied to the data observed at different times and under different rainfall conditions. Further, for a station that has already passed the first-stage temperature difference examination, an additional “step consistency” test is applied to enhance the ability to identify errors.

Procedurally, the observation must pass the range check first before it passes through the spatial check or the temporal check. Time control is an important issue in real-time data management, as all jobs must be done in a short period of time. The range check is only a primary test that can help promptly screen out obvious outliers. The data that have passed the range check are not immediately confirmed as correct; they will be evaluated further.

The spatial check is the main data verification strategy used. Compared to the temporal check scheme, its theoretical basis is more comprehensive and its judgments are much more reliable. The applicability of the spatial check, however, depends on the density of the network. As new stations are constructed, the parameters need to be updated yearly to improve the performance of the QC program.

The temporal check is applied only to the stations that are unable to do the spatial check, such as those located on the outlying islands or those recently constructed. When calculating the temperature difference, if the previous observation is absent, the interpolated value estimated by the universal kriging technique is used instead for the comparison. Note that the spatial check scheme proposed in this paper is possibly universally applicable, while the temporal check scheme is only regionally adaptable.

From post-examinations of the data collected up until 2014, as well as the experience obtained from online monitoring, it is shown that the QC program proposed in this paper is able to filter out most of the significant errors. Real-time data verification is difficult to conduct, since the events in the following hour are totally unknown; nevertheless, the online QC program has been working well since it was built.

The data checking schemes were developed based on statistics; thus, uncertainty does exist. To handle unexpected problems, Taiwan’s CWB has built a mobile application that can send related information to researchers automatically once data are flagged. This procedure allows experts to make judgments to alter the flags.

For future development, more studies for the improvement of the temporal check scheme are needed to reinforce the reliability of the scheme. Determining the threshold is subjective and time consuming. An objective, automated, and localized procedure for selecting the appropriate thresholds is required. The prescribed rules for checking the data under the unanticipated weather conditions, such as a weather front or a foehn wind, are also desired. These events will be taken into consideration in the future.

## Acknowledgments

This project was funded by Taiwan’s Central Weather Bureau. Special thanks to Deputy Director Dr. Kuo Chen Lu and the authorities for their full support and assistance.

# APPENDIX A

## The Weights Derivation Algorithms

Let the weighted estimate be

$$\hat{T}_0 = \sum_{j=1}^{n} w_j \hat{T}_j, \tag{A1}$$

where $w_j$ and $\hat{T}_j$ are the weight and the estimate of reference station $j$, respectively. According to the theorem of the unbiased estimator, it confines that

$$\sum_{j=1}^{n} w_j = 1. \tag{A2}$$

Substituting Eqs. (A1) and (A2) into the expression for the expected value of the error sum of squares and minimizing it subject to the constraint with a Lagrange multiplier $\lambda$ yields

$$\sum_{j=1}^{n} w_j \operatorname{cov}(\hat{T}_k, \hat{T}_j) + \lambda = \operatorname{cov}(T_0, \hat{T}_k), \qquad k = 1, \ldots, n,$$

where $\operatorname{cov}(\hat{T}_k, \hat{T}_j)$ is the covariance between the estimates of reference stations $k$ and $j$. The simultaneous equations given above can be expressed in a matrix form:

$$\begin{bmatrix}
\operatorname{cov}(\hat{T}_1,\hat{T}_1) & \cdots & \operatorname{cov}(\hat{T}_1,\hat{T}_n) & 1 \\
\vdots & \ddots & \vdots & \vdots \\
\operatorname{cov}(\hat{T}_n,\hat{T}_1) & \cdots & \operatorname{cov}(\hat{T}_n,\hat{T}_n) & 1 \\
1 & \cdots & 1 & 0
\end{bmatrix}
\begin{bmatrix} w_1 \\ \vdots \\ w_n \\ \lambda \end{bmatrix}
=
\begin{bmatrix} \operatorname{cov}(T_0,\hat{T}_1) \\ \vdots \\ \operatorname{cov}(T_0,\hat{T}_n) \\ 1 \end{bmatrix}.$$

# APPENDIX B

## The Fake Data Generation Processes

The fake data generation processes are as follows.

- In statistics, a time series variable $T_i(t)$ can be taken as the sum of a deterministic term $m_i(t)$, which can be the trend or the periodicity of the station, and a stochastic term $\varepsilon_i(t)$. Should there be $n$ stations, the time series variables can be expressed as $T_i(t) = m_i(t) + \varepsilon_i(t)$, $i = 1, \ldots, n$. The expected value of $\varepsilon_i$ is 0, and the variance of $\varepsilon_i$ is $\sigma_i^2$. Assuming the correlation coefficient between any two stations is related only to the distance and follows the exponential form, the covariance of $\varepsilon_i$ and $\varepsilon_j$ can be given as $\operatorname{cov}(\varepsilon_i, \varepsilon_j) = \sigma_i \sigma_j \exp(-d_{ij}/b)$, where $d_{ij}$ is the distance between stations $i$ and $j$, and $b$ is a constant length parameter. Let $\boldsymbol{\varepsilon}$ be the stochastic vector and $\mathbf{C}$ be the covariance matrix, where $\mathbf{C}$ is a Hermitian matrix that, by using the Cholesky decomposition technique, can be decomposed into the product of a lower triangular matrix $\mathbf{L}$ and its conjugate transpose $\mathbf{L}^{*}$: $\mathbf{C} = \mathbf{L}\mathbf{L}^{*}$. A random sample of $\boldsymbol{\varepsilon}$ can then be derived by multiplying $\mathbf{L}$ by a Gaussian white noise vector $\mathbf{z}$ generated randomly by the Monte Carlo experiment, $\boldsymbol{\varepsilon} = \mathbf{L}\mathbf{z}$, where $z_i \sim N(0, 1)$.
- Let station 0 be the target station and station $j$ be a reference station, $j = 1, \ldots, n$. Each reference station relates to the target station with a linear relationship; hence, $\varepsilon_j$ can also be taken as the variation of station $j$ from the mean value of station 0. Repeating the random sampling experiment a sufficient number of times, the derived values can then be used to calculate the weights of the references.
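The generation steps above can be sketched with numpy as follows. The exponential covariance and the Cholesky-plus-white-noise sampling follow the description directly; the equal variance of 9.0 comes from section 4a, while the correlation length value and function name are illustrative assumptions.

```python
import numpy as np

def sample_correlated_stations(coords, variance=9.0, corr_length=20.0,
                               n_samples=270, seed=0):
    """Generate correlated fake station samples as in appendix B.

    Covariance between stations i and j is variance * exp(-d_ij / corr_length),
    with d_ij the distance (km) between stations. Samples are drawn by
    Cholesky-factoring the covariance matrix and multiplying the lower
    factor by Gaussian white noise. corr_length=20.0 is an assumed value.
    """
    coords = np.asarray(coords, dtype=float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = variance * np.exp(-d / corr_length)
    L = np.linalg.cholesky(cov)
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples, len(coords)))
    return z @ L.T  # each row: one synthetic snapshot across all stations
```

Because each row is `L @ z` for independent standard-normal `z`, the sample covariance converges to the prescribed exponential covariance as the number of samples grows.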

## REFERENCES

Durre, I., M. J. Menne, B. Gleason, T. G. Houston, and R. S. Vose, 2010: Comprehensive automated quality control of daily surface observations. *J. Appl. Meteor. Climatol.*, **49**, 1615–1633, doi:10.1175/2010JAMC2375.1.

Feng, S., Q. Hu, and W. Qian, 2004: Quality control of daily meteorological data in China, 1951–2000: A new dataset. *Int. J. Climatol.*, **24**, 853–870, doi:10.1002/joc.1047.

Fiebrich, C. A., C. Morgan, A. G. McCombs, P. K. Hall Jr., and R. A. McPherson, 2010: Quality control procedures for mesoscale meteorological data. *J. Atmos. Oceanic Technol.*, **27**, 1565–1582, doi:10.1175/2010JTECHA1433.1.

Graybeal, D. Y., A. T. DeGaetano, and K. L. Eggleston, 2004: Improved quality control for historical hourly temperature and humidity: Development and application to environmental analysis. *J. Appl. Meteor. Climatol.*, **43**, 1722–1735, doi:10.1175/JAM2162.1.

Hsu, H. H., and C. T. Chen, 2002: Observed and projected climate change in Taiwan. *Meteor. Atmos. Phys.*, **79**, 87–104, doi:10.1007/s703-002-8230-x.

Hubbard, K. G., and J. You, 2005: Sensitivity analysis of quality control using the spatial regression approach—A case study of the maximum/minimum air temperature. *J. Atmos. Oceanic Technol.*, **22**, 1520–1530, doi:10.1175/JTECH1790.1.

Hubbard, K. G., S. Goddard, W. D. Sorensen, N. Wells, and T. T. Osugi, 2005: Performance of quality control procedures for an applied climate information system. *J. Atmos. Oceanic Technol.*, **22**, 105–112, doi:10.1175/JTECH-1657.1.

Karataş, S., and L. Yalçin, 2005: Data quality management. *WMO Tech. Conf. on Meteorological and Environmental Instruments and Methods of Observation*, Bucharest, Romania, WMO, P3(33). [Available online at https://www.wmo.int/pages/prog/www/IMOP/publications/IOM-82-TECO_2005/Posters/P3%2833%29_Turkey_5_Karatas.pdf.]

Lussana, C., F. Uboldi, and M. R. Salvati, 2010: A spatial consistency test for surface observations from mesoscale meteorological networks. *Quart. J. Roy. Meteor. Soc.*, **136**, 1075–1088, doi:10.1002/qj.622.

Maidment, D. R., 1992: *Handbook of Hydrology*. McGraw-Hill, 1424 pp.

Meek, D. W., and J. L. Hatfield, 1994: Data quality checking for single station meteorological databases. *Agric. For. Meteor.*, **69**, 85–109, doi:10.1016/0168-1923(94)90083-3.

Menne, M. J., and C. E. Duchon, 2001: A method for monthly detection of inhomogeneities and errors in daily maximum and minimum temperatures. *J. Atmos. Oceanic Technol.*, **18**, 1136–1149, doi:10.1175/1520-0426(2001)018<1136:AMFMDO>2.0.CO;2.

You, J., K. G. Hubbard, and S. Goddard, 2008: Comparison of methods for spatially estimating station temperature in a quality control system. *Int. J. Climatol.*, **28**, 777–787, doi:10.1002/joc.1571.