## 1. Introduction

A continually increasing number of meteorological observation sites is producing larger and larger amounts of data. Meteorologists can only benefit from this extensive quantity of measurements if the data quality meets the requirements implied by the intended applications. On the one hand, high quality long-term observational data are essential for identifying climate changes or for validating climate model simulations (Feng et al. 2004). On the other hand, quality controlled real-time data are fundamental for nowcasting and model validation and furthermore are used to provide proper initial conditions for numerical weather prediction (Ingleby and Lorenc 1993).

Quality control (QC) of meteorological data is a quite young discipline. Until the early stages of the numerical weather prediction (NWP) movement, only slight attention had been paid to data quality and the QC process had been considered to be an unglamorous task (Gandin 1988). During the second half of the last century, the progress of NWP models brought about the recognition of the importance of QC. The manual inspection of observations was followed by simple QC algorithms that used empirically tested adjustments (Lorenc and Hammon 1988). Increasing computer power enabled the development of more complex QC methods, which will be summarized in order to classify the QC method presented in this paper. Nowadays, QC is not only an essential part of the acquisition, transmission, and processing of observational data, but it is also strictly recommended by different guides from the World Meteorological Organization in order to achieve a certain standard regarding the international exchange of data (WMO 2008).

This article presents a new QC method developed at the Department of Meteorology and Geophysics at the University of Vienna, the Vienna Enhanced Resolution Analysis Quality Control (VERA-QC). In comparison to other approaches it requires neither previous knowledge nor prognostic model information. Thus, VERA-QC is especially suited for model validation. The name VERA-QC is derived from its application as a preprocessing tool of the department’s operational analysis of basic meteorological parameters (VERA), carried out on an hourly basis (Pöttschacher et al. 1996). In section 2 an overview of the types of errors and current common QC methods is given to provide a basis for the classification of the QC procedure presented in this paper. Section 3 describes the VERA-QC method in detail, points out common problems and difficulties associated with this special QC approach, and offers solutions, accompanied by idealized two-dimensional examples. Section 4 offers one-, two-, and three-dimensional examples with artificial observations for simple and more complex station distributions to present the properties characterizing the VERA-QC method. A comparison of the performances of VERA-QC and two other QC methods is carried out in section 5. The article closes with section 6, which presents our conclusions and offers an outlook on further planned developments and applications.

## 2. Error types and QC methods

### a. Error types

Before summarizing the most common QC procedures, one should be aware of their purpose, which is to recognize errors and to decide how to cope with them. Measurements are naturally affected by different kinds of errors. Although there is no standardized classification scheme, the so-called observational errors are usually divided into the following four main types:

- Random errors—These errors are caused by the fact that an instrument can only give an approximation of nature. Furthermore, variations in other parameters can influence the measurements. As the great number of independent factors is governed by the law of large numbers, these errors can be treated like Gaussian-distributed random numbers around zero (Gandin 1988).
- Systematic errors—In addition to the white noise, systematic errors occur mainly due to calibration errors or long-term drifts of sensors. Because of their usual persistence in time and their asymmetric distribution around zero, they add a bias to the measured parameter.
- Micrometeorological errors—Spatial and temporal dimensions of meteorological phenomena cover a wide range of scales. Phenomena belonging to smaller scales than resolvable by the observational network result in micrometeorological errors. Although a measured value can be considered to be correct, such small-scale perturbations result in a misleading analysis because of the nonrepresentativeness of the single observation. Depending on the cause, such as subscale meteorological effects (e.g., urban heat islands) or meteorological noise (e.g., random subscale effects caused by turbulence), micrometeorological errors can be of systematic, as well as random, nature (Steinacker et al. 2000).
- Gross errors—The most attention is paid to the so-called large or gross errors. This type of error is characterized by its rare occurrence and its large magnitude, and therefore it does not follow the Gaussian distribution law. Gross errors have strong effects on analyses and forecasts and are caused by the malfunctioning of measurement devices and by mistakes happening during data transmission, reception, and processing. A detailed overview of these errors is given by Gandin (1988).

### b. Common QC methods

Since the beginning of weather analysis and prediction, meteorologists have been analyzing weather charts and simultaneously checking the quality of the synoptic observations. Based on this visual inspection, an observation is retained or rejected. Because of the increasing amount of data and the need for real-time initial fields for NWP, the time-consuming human component of the QC procedure has become an intractable task. Nevertheless, some data centers still consider human inspection to be an important part of the QC procedure (Shafer et al. 2000).

The highest significance can be assigned to the automatic QC. Depending on the existence of additional information, the spatiotemporal distribution of the observation data, and the intended application, an appropriate QC method can be chosen out of a wide variety of different QC procedures. As further information, we consider climate data, background fields based on earlier forecasts, and a priori knowledge, such as error statistics. An adequate QC method also depends on the dimension of the available data involving different levels of redundancy. It is self-evident that observations embedded in a dense spatiotemporal monitoring network allow more complex QC techniques than do single-station time series or isolated atmospheric soundings. According to the intended application, such as the creation of climate databases, model-independent analyses, or the calculation of first-guess fields, some QC methods are inapplicable. For example, an analysis for model validation should not be based on data controlled by a QC algorithm that makes use of a background field depending on the same model (Steinacker et al. 2000).

In the following we try to classify the members of the versatile family of QC methods. First of all, one can distinguish between three different types of outputs: the simpler QC algorithms can only accept or reject an observation, whereas more sophisticated ones are able to suggest corrections or give the probability of gross errors. Second, a distinguishing feature can be found in the relationship between the QC algorithm and the analysis or forecast. The great majority of QC techniques are stand-alone applications, while some others are partly or completely incorporated into the analysis or forecast algorithms. A third criterion is the use of previous knowledge, such as statistical limits and background fields based on short-term forecasts. Additionally, some QC procedures consider each single observation separately, whereas enhanced ones take into account data from neighboring stations. This allows us to take advantage of continuity, a criterion satisfied by most meteorological parameters. In practice, more than one parameter may be measured at a station site. Therefore, it is natural to check for internal consistency and to ensure that the data fulfill physical constraints (e.g., the hydrostatic relation), in which a further category of QC methods can be seen.

Below we give an overview of the most established QC methods.

- Limit checks—Regardless of the actual weather situation, there are physical limits for each parameter that can never be exceeded (e.g., negative precipitation values do not exist). WMO (2008) contains detailed lists of physical limits for several parameters. A more enhanced check examines the local daily and annual weather conditions and compares an observation against climatological limits. Almost every QC application begins with these plausible value checks, which make it possible to identify gross errors and to discard the observation immediately. Especially when establishing a climatological dataset, this method makes a significant contribution to assuring its quality (Feng et al. 2004; Baker 1992).
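Such a plausible value check can be sketched in a few lines; the bounds are supplied by the user, and the return labels here are illustrative assumptions rather than an established convention:

```python
def limit_check(value, phys_lo, phys_hi, clim_lo=None, clim_hi=None):
    """Plausible-value check: hard physical limits first, then optional
    climatological limits. All bounds are supplied by the user."""
    if not (phys_lo <= value <= phys_hi):
        return "gross error"   # physically impossible value
    if clim_lo is not None and not (clim_lo <= value <= clim_hi):
        return "suspect"       # possible, but climatologically unusual
    return "ok"
```

For example, a precipitation reading of −0.5 mm fails the physical limit (negative precipitation does not exist), while a temperature of 35°C may pass the physical check yet still be flagged against local climatological bounds.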

- Temporal consistency checks—The availability of time-resolved observations allows us to check if the instrument is stuck at a particular reading (Shafer et al. 2000) or if the tendency represents values that are implausible compared to climatological time series. These so-called persistence checks and the step change test are, similar to the limit check, one of the fundamental components of QC processes. Fiebrich and Crawford (2001) describe these tests in detail and give a list of thresholds for maximal allowed steps and minimal required standard deviations for various parameters.
- Internal consistency checks—Normally, more than one meteorological parameter is measured at an observing station at the same time. Some of these parameters are physically related and the internal consistency check tests if values of related parameters are free of contradictions. An example would be checking the dewpoint temperature *T*_{d} and the air temperature *T* for the relation *T*_{d} ≤ *T*. If time series of one parameter are available, further tests such as *T*_{min} ≤ *T*_{max} are possible (Reek et al. 1992). A more complex internal consistency check makes use of physical constraints, such as the hydrostatic or the geostrophic relationship, by computing both sides of the relevant equation independently and comparing these results (Baker 1992).
- Spatial consistency checks—Considering meteorological phenomena of scales exceeding the one resolved by the observational net, one can expect the related parameters to be distributed smoothly and therefore feature a high degree of autocorrelation. The redundancy of parameters, such as mean sea level pressure or potential temperature, allows us to compare the usually similar values with each other, which helps to detect outliers. The so-called consistency or buddy-check approach calculates the difference (residual) between a measured value and the expected value at the position of the station in question. This expected value is determined by an analysis that takes into account a certain number of adjacent stations, excluding the considered observation. The possible functional relationships between the influencing values and the value to be interpolated result in algorithms of different complexity. The following methods are widespread:
  - Inverse distance interpolation (ID)—The interpolated value is determined by the sum of the weighted surrounding station values, which are located within a certain radius from the target station. The weighting function is derived from the inverse of the distance between the target and the surrounding station. This quite simple approach is described, for example, by Wade (1987). A more advanced possibility to weight the surrounding stations according to their distances is introduced by Barnes (1964).
  - Polynomial interpolation—Another approach for computing the interpolated value is to find a polynomial of order *n* that fits the measured values in the surroundings of the observation in question. A polynomial of order zero represents the simplest possible case, which means comparing the value of the target station to those of the surroundings. Nowadays, higher-order polynomial functions or splines (piecewise composed polynomials) are used. To obtain a smoothed field, it is possible to formulate a (cubic) spline interpolation as an optimization problem regarding a minimal roughness or curvature.
  - Spatial regression (SR)—Instead of using only a weighting function depending on the distance, a more sophisticated approach is to assign the weights according to the correlation between the station of interest and the neighboring stations. These weights are based on the root-mean-square error (RMSE) between the previous observations at the target and at the surrounding stations. Linear regression between the target station and each of the surrounding stations is performed, and results in an estimated value for the station of interest. Together with the correlation, a confidence interval for the observed value can be defined. If the observed value lies outside the boundaries of this interval, it is suspected to be erroneous. In a case study concerning *T*_{min} and *T*_{max}, carried out by Hubbard and You (2005), SR proved to be superior to ID. The SR has the advantage of not automatically assigning the highest weight to the nearest station, for example, when considering a coastal station that is more comparable to another coastal station farther away than to a mountain station in close proximity.

  All these methods have in common that their results allow us to compute residuals that measure the quality of the tested station value. Depending on the residuals, this value can be accepted, corrected, or dismissed.
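The buddy-check idea with inverse distance weights can be sketched as follows; the weighting exponent and the residual threshold are illustrative assumptions, not values from the cited studies:

```python
def buddy_check_idw(target, neighbors, power=2.0, max_residual=5.0):
    """Inverse-distance buddy check: estimate the value at the target
    station from its neighbours (the target observation itself is
    excluded) and flag the observation if the residual is too large.
    `neighbors` is a list of ((x, y), value) tuples."""
    (tx, ty), obs = target
    num = den = 0.0
    for (x, y), val in neighbors:
        # inverse distance weight; closer stations contribute more
        w = ((x - tx) ** 2 + (y - ty) ** 2) ** (-power / 2.0)
        num += w * val
        den += w
    expected = num / den
    residual = obs - expected
    return residual, abs(residual) > max_residual
```

With four equidistant neighbors all reporting 10.0, an observation of 20.0 at the target yields a residual of 10.0 and is flagged, whereas 11.0 would pass.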

- Optimum interpolation (OI)—As in the previously mentioned spatial consistency checks, the result of OI consists of an estimated value to which the observation is compared. Additionally, SR and OI have the use of statistical information in common. OI requires the computation of a background field and two error covariance matrices based on observational and background data. A significant difference between OI and other spatial consistency checks is the possibility of analyzing isolated stations: the influence of the surrounding stations is limited to, but not required for, the computation of the background field. A modified version of OI was used by Lorenc (1981) to check data quality.
- Bayesian quality control (BQC)—In contrast to all previously described methods, BQC gives the probability that an observation is affected by a gross error. This method is based on Bayes’s theorem, a mathematical formalism that allows the computation of the gross error probability. Therefore, observation and background values and their error distributions, as well as an a priori estimate of the gross error probability, are required. This formalism is implemented in two ways: either the posterior probability for gross errors is calculated for each observation separately, or it is computed simultaneously for a combination of stations. A detailed description of different approaches to BQC can be found in Lorenc and Hammon (1988) or in Ingleby and Lorenc (1993). The obvious advantage of this method lies in the possibility of computing the probability that an observation is affected by a gross error, which represents a natural criterion to reject or accept an observed value.
- Variational quality control (VarQC)—Leading operational numerical weather prediction centers are using a variational approach [e.g., four-dimensional variational data assimilation (4D-Var)] for the analysis and forecast of atmospheric parameters. The variational approach is based on the minimization of a cost function that is basically composed of the observation and background fields and their error covariances. This variational approach provides the possibility of incorporating the quality control procedure within the analysis itself by weighting the observation term in the cost function according to the gross error probability. It is evident that one should apply Bayes’s theorem for determining this probability, as is implemented at the European Centre for Medium-Range Weather Forecasts (ECMWF) (Andersson and Järvinen 1999).
- Complex quality control (CQC)—Most of the described QC mechanisms do not exclude each other, which offers us the possibility of carrying out some of these processes consecutively. As a consequence, there are several residuals proposing to correct, reject, or retain a flagged observation. These flags and residuals have to be combined into a unique proposal, which is carried out by a so-called decision-making algorithm (DMA). The successive application of the QC components and of the DMA constitutes the CQC (Gandin 1988).
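The core of the Bayesian gross-error computation (BQC) can be sketched as follows, assuming a Gaussian distribution of observation-minus-background departures for good data and a flat distribution of fixed width for gross errors; both distributional choices and all parameter values are illustrative assumptions, not those of the cited implementations:

```python
import math

def gross_error_probability(obs, background, sigma=1.0, prior=0.01, box=50.0):
    """Posterior probability of a gross error via Bayes's theorem.
    sigma: std. dev. of good-data departures; prior: a priori gross
    error probability; box: width of the flat gross-error distribution."""
    departure = obs - background
    # likelihood of the departure under the "good data" hypothesis
    p_good = math.exp(-0.5 * (departure / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    # likelihood under the "gross error" hypothesis (flat distribution)
    p_bad = 1.0 / box
    return prior * p_bad / (prior * p_bad + (1.0 - prior) * p_good)
```

A small departure yields a posterior probability well below the prior, while a departure of many standard deviations drives the probability toward one, which is exactly the natural accept/reject criterion mentioned above.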

### c. Classification of VERA-QC compared to the previously presented methods

The quality control procedure presented in this paper combines elements of some of the previously mentioned methods; moreover, it adds completely new components. Needless to say, simple controls, such as limit checks, climatological checks, and single-station internal and temporal consistency checks, are applied first. Observations passing these tests are checked for their spatial and, if required, also for their spatiotemporal consistency, which is considered to be the main focus of the presented QC procedure. The spatial consistency is checked by a variational approach that minimizes the curvature of the analyzed field. The VERA-QC method can be considered in part a complex QC method because it (i) recognizes gross errors in a first iteration, (ii) flags them, (iii) excludes stations with these errors, and (iv) repeats the procedure.
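The iterative scheme of steps (i)–(iv) can be sketched generically; `run_qc` below stands for any QC pass that returns the identifiers flagged as gross errors (a hypothetical callable, not the VERA-QC implementation itself):

```python
def iterative_qc(observations, run_qc):
    """Run the QC, remove observations flagged as gross errors, and
    repeat the whole procedure until no new flags appear."""
    active = dict(observations)   # station id -> observed value
    rejected = []
    while True:
        flagged = run_qc(active)
        if not flagged:
            return active, rejected
        for sid in flagged:
            rejected.append(sid)
            del active[sid]       # exclude the station and iterate again
```

Repeating the pass after exclusion matters because, as discussed in section 3f, neighbors of an outlier would otherwise keep the misleading deviations induced by it.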

## 3. Methodology

This section describes the spatial consistency check that is carried out after data pass the simple QC checks mentioned above. Assuming an observation network with a sufficiently high density, meteorological parameters can be considered to be smoothly distributed. The precipitation field caused by a rain shower, for example, becomes coherent and smooth if the spacing of an observational network is much below the extent of the rain shower and the temporal resolution is much higher than its duration. Hence, data of conventional synoptic networks do not allow a QC check concerning convective precipitation; however, they do allow the QC verification of, for example, extratropical pressure systems. Naturally, measurements are erroneous and lead in general to an observation field that is rougher than the idealized error-free field. The roughness or smoothness of these fields can mathematically be expressed in terms of a cost function that consists of the integral of the squared curvature over the controlled domain. Minimizing this cost function under certain constraints leads to an optimization problem that is solved by a variational approach. As a result, we obtain deviations that are proposals to correct the measurements. By applying these deviations to the observed values, a field meeting the requirement of minimal curvature is obtained. The following presents all the steps, such as the definition of the cost function, the discretization of the curvature and its derivatives, and the solution of the resulting matrix equation.

### a. Cost function of the variational approach

Based on the observed field *Ψ*_{o}, we define the error-free analysis field *Ψ*_{a} and its curvature ∇²*Ψ*_{a}. Furthermore, we define the cost function *J* as the sum of the squared curvature at all *N* grid points of the discretized analysis field as follows:

$$J = \sum_{n=1}^{N} \left[\left(\nabla^{2}\Psi_{a}\right)_{n}\right]^{2}. \tag{1}$$

The cost function *J* takes a minimal value if the curvature of the analysis field is minimal as well. As explained in more detail below, the analysis field is approximated in terms of the known observation field. The squared curvature comprises the second derivatives of the field with respect to all coordinates. For a two-dimensional (*x*, *y*) example, this can be written as

$$\left(\nabla^{2}\Psi\right)^{2} = \left(\frac{\partial^{2}\Psi}{\partial x^{2}}\right)^{2} + 2\left(\frac{\partial^{2}\Psi}{\partial x\,\partial y}\right)^{2} + \left(\frac{\partial^{2}\Psi}{\partial y^{2}}\right)^{2}. \tag{2}$$

A compact and more general version of Eq. (2) for *D* dimensions is represented by

$$\left(\nabla^{2}\Psi\right)^{2} = \sum_{d_{1}=1}^{D}\sum_{d_{2}=1}^{D} \left(\frac{\partial^{2}\Psi}{\partial d_{1}\,\partial d_{2}}\right)^{2}, \tag{3}$$

where *d*_{1} and *d*_{2} stand for the spatial coordinates or the time (e.g., in four dimensions: *d*_{1} = *x*, *y*, *z*, *t*; *d*_{2} = *x*, *y*, *z*, *t*). Because the analysis field and its curvature at any point *n* are unknown, they can be approximated by a first-order Taylor series around the curvature of the observed field *Ψ*_{o}:

$$\left(\nabla^{2}\Psi_{a}\right)_{n} \approx \left(\nabla^{2}\Psi_{o}\right)_{n} + \sum_{E}\frac{\partial\left(\nabla^{2}\Psi_{o}\right)_{n}}{\partial\Psi_{E}}\left(\Psi_{a}-\Psi_{o}\right)_{E}. \tag{4}$$

The subscript *E* denotes not only the station *n* but also neighboring stations that are allowed to be erroneous. It should be pointed out that this is a special feature in contrast to the above-described consistency checks, where it is common that only the station in question is considered to be erroneous. In Eq. (4), the only unknown variables are *Ψ*_{a} and the so-called deviations (*Ψ*_{a} − *Ψ*_{o}) = Δ*Ψ*. To compute these deviations, one has to combine Eqs. (1) and (4), differentiate the cost function *J* with respect to all deviations Δ*Ψ*_{n}, and solve the resulting equation system for these deviations.
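In one dimension, the cost function of summed squared curvatures reduces to a sum of squared second differences over the grid; a minimal sketch (assuming unit grid spacing unless specified):

```python
def cost_function_1d(values, dx=1.0):
    """Sum of squared discrete curvatures (centered second differences)
    over the interior points of a 1-D field: the 1-D analog of Eq. (1)."""
    return sum(((values[i - 1] - 2.0 * values[i] + values[i + 1]) / dx ** 2) ** 2
               for i in range(1, len(values) - 1))
```

A linear field has zero cost, while a single outlier raises the cost sharply, which is precisely what the minimization exploits to locate suspicious observations.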

### b. Finding natural neighbors

To declare neighboring stations that are allowed to have potential errors [subscript *E* in Eq. (4)], we have to define three terms:

- Main station—The cost function in Eq. (1) consists of as many terms as stations exist in the considered domain. In this domain one station after another is always regarded as the center of the local neighborhood and is called the main station. As an example, the station in Fig. 1 marked by the pentagram was selected to be the local main station.
- Primary stations—All next natural neighbors of an actual main station and the main station itself are denoted as primary stations. Stations 2, 3, 4, 7, and 9 in Fig. 1 correspond to the primary neighborhood.
- Secondary stations—All next natural neighbors of the *p*th primary station and the primary station itself are named secondary stations. Particularly, the secondary neighborhood denotes the whole subset consisting of a main station, all its next nearest stations, and furthermore their adjacent stations. In Fig. 1, all stations of the domain excluding station 8 are secondary neighbors.

The distinction between primary and secondary neighborhoods is necessary, because their stations are influencing the cost function *J* in two different ways. Stations in the primary neighborhood are located next to the main station and their values are allowed to vary. Mathematically, this is expressed in the Taylor series expansion of Eq. (4) where the primary stations are denoted by the subscript *E*. Stations in the wider secondary neighborhood are only used to compute the *n* = 1, 2, … , *N* curvature terms in the same equation.

An appropriate method of finding natural neighbors is the so-called Delaunay triangulation (Barber et al. 1996). The principle of this method in two dimensions is to connect three points at a time in such a way that no other point can be found in the circumcircle of the so-composed triangle. This concept can be expanded easily to three or more dimensions by replacing triangles with tetrahedrons and circumcircles with circumspheres and higher-dimensional analogs.
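For small point sets, the empty-circumcircle criterion can be applied directly by brute force; the following sketch (far slower than the Quickhull-based algorithm of Barber et al. 1996, so only for illustration) returns the natural neighbors of every station in two dimensions:

```python
from itertools import combinations

def natural_neighbors(pts):
    """Brute-force 2-D Delaunay neighbours: three points form a Delaunay
    triangle if no other point lies inside their circumcircle; points
    sharing a triangle edge are natural neighbours."""
    def circumcircle(a, b, c):
        (ax, ay), (bx, by), (cx, cy) = a, b, c
        d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
        if abs(d) < 1e-12:
            return None  # collinear points have no circumcircle
        ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
              + (cx**2 + cy**2) * (ay - by)) / d
        uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
              + (cx**2 + cy**2) * (bx - ax)) / d
        return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2
    nbrs = {i: set() for i in range(len(pts))}
    for i, j, k in combinations(range(len(pts)), 3):
        cc = circumcircle(pts[i], pts[j], pts[k])
        if cc is None:
            continue
        ux, uy, r2 = cc
        # empty-circumcircle test against all remaining points
        if all((px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9
               for m, (px, py) in enumerate(pts) if m not in (i, j, k)):
            for a, b in ((i, j), (j, k), (i, k)):
                nbrs[a].add(b)
                nbrs[b].add(a)
    return nbrs
```

For four corner stations with one station in the middle, the central station is a natural neighbor of all four corners, while diagonally opposite corners are not connected, which illustrates how the triangulation adapts the neighborhood to the local station geometry.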

In comparison to Fig. 1, which shows the simple case of a homogeneous station distribution, Fig. 2 illustrates the stations of the European surface synoptic observation (SYNOP) network that reported on 1200 UTC 29 August 2009. One can observe national distinctions concerning the density of available measurements and that sometimes very distant or very close stations are connected, which requires a special treatment as described in section 3g. At this point an advantage of the presented method becomes obvious: whereas some spatial consistency checks such as the ID demand the definition of a radius of influence around the actual main station, VERA-QC offers a natural way of selecting influencing neighbors according to the local station density. This is done with the help of the Delaunay triangulation. Nevertheless it is reasonable to define an upper limit for the allowed distance between neighboring stations. To avoid fixed thresholds and to take into account the local station density, the upper limit is defined in terms of a multiple of the mean distances in the considered subdomain.

It should be mentioned that before the triangulation procedure is carried out, stations with an obviously low correlation to the surroundings (e.g., a mountain station surrounded by valley stations) are excluded from the QC procedure and any further analysis. This approach is comparable to the conventional procedure of excluding mountain stations when analyzing sea level pressure fields.

### c. Specification of the cost function

Combining Eqs. (1), (3), and (4) yields the specified cost function

$$J = \sum_{m=1}^{N}\sum_{d_{1}=1}^{D}\sum_{d_{2}=1}^{D} \left[\left(\frac{\partial^{2}\Psi_{o}}{\partial d_{1}\,\partial d_{2}}\right)_{m} + \sum_{E}\frac{\partial}{\partial\Psi_{E}}\left(\frac{\partial^{2}\Psi_{o}}{\partial d_{1}\,\partial d_{2}}\right)_{m}\Delta\Psi_{E}\right]^{2}, \tag{5}$$

where *m* counts from 1 to *N* and *d*_{1} and *d*_{2} count from 1 to *D*; furthermore, *E* runs over the primary neighbors of the *m*th station. The goal is to find the deviations Δ*Ψ*_{n} that are minimizing the cost function *J*, which requires differentiating *J* with respect to all Δ*Ψ*_{n} and setting the derivatives equal to zero:

$$\frac{\partial J}{\partial\,\Delta\Psi_{n}} = 0, \qquad n = 1, \ldots, N. \tag{6}$$

### d. Solution of the optimization problem

Writing the system of Eq. (6) in matrix form, with *i* and *j* as row and column indices, each ranging from 1 to *N*, leads to

$$\sum_{j=1}^{N} A_{i,j}\,\Delta\Psi_{j} = b_{i}. \tag{7}$$

The elements *A*_{i,j} and the right-hand side *b*_{i} of the matrix Eq. (7) can be expressed as

$$A_{i,j} = F_{i,j}\sum_{m=1}^{N}\sum_{d_{1}=1}^{D}\sum_{d_{2}=1}^{D} \frac{\partial}{\partial\Psi_{i}}\left(\frac{\partial^{2}\Psi_{o}}{\partial d_{1}\,\partial d_{2}}\right)_{m} \frac{\partial}{\partial\Psi_{j}}\left(\frac{\partial^{2}\Psi_{o}}{\partial d_{1}\,\partial d_{2}}\right)_{m}, \tag{8}$$

$$b_{i} = -\sum_{m=1}^{N}\sum_{d_{1}=1}^{D}\sum_{d_{2}=1}^{D} \left(\frac{\partial^{2}\Psi_{o}}{\partial d_{1}\,\partial d_{2}}\right)_{m} \frac{\partial}{\partial\Psi_{i}}\left(\frac{\partial^{2}\Psi_{o}}{\partial d_{1}\,\partial d_{2}}\right)_{m}, \tag{9}$$

where *F*_{i,j} is a flag matrix with *F*_{i,j} = 1 if stations *i* and *j* are natural primary neighbors and *F*_{i,j} = 0 otherwise.

Considering a real observation network such as the one shown in Fig. 2, the number of equations in the linear but coupled system of Eq. (6) easily exceeds 1000. The numerical solution of such a system of equations demands high computational power. By using the concept of sparse matrices, whose elements are predominantly zeros, the solution of the large system of Eq. (6) with *N* unknowns nevertheless remains computationally feasible.
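The article does not prescribe a particular solver; one standard choice for a large, sparse, symmetric positive-definite system of this kind is the conjugate-gradient method. A minimal sketch on a dictionary-of-keys sparse matrix (both the solver choice and the storage format are assumptions for illustration):

```python
def cg_solve(A, b, tol=1e-12, max_iter=200):
    """Conjugate gradient for a symmetric positive-definite sparse matrix.
    A is a dict mapping (i, j) -> nonzero value; b is the right-hand side."""
    n = len(b)
    def matvec(x):
        # only the stored nonzero elements are touched -- the point of sparsity
        y = [0.0] * n
        for (i, j), v in A.items():
            y[i] += v * x[j]
        return y
    x = [0.0] * n
    r = b[:]            # residual (x starts at zero)
    p = r[:]
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

Because the flag matrix of Eq. (8) zeroes out all pairs that are not primary neighbors, each matrix row holds only a handful of nonzeros, so each matrix-vector product costs far less than the dense *N*² operations.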

### e. Discretization of the curvature and its derivatives

Mathematically, the curvatures and their derivatives are defined at all points of the *D*-dimensional domain. In the course of the discretization one has to select a finite number of homogeneously distributed points at which these variables are evaluated. Usually station positions are not distributed regularly and a homogeneous grid has to be defined. To reduce the number of grid points (which would otherwise be computationally expensive) and to adjust the gridpoint density to the inhomogeneous station distribution, subsets of required grid points are placed around the individual stations as shown in Fig. 3.

To evaluate the curvatures, the irregularly distributed observations first have to be interpolated to the local grid points, for example, by an inverse distance weighting of the form

$$\Psi_{n} = \frac{\sum_{s}\left(d_{n,s}^{\alpha}+\beta\right)^{-1}\Psi_{s}}{\sum_{s}\left(d_{n,s}^{\alpha}+\beta\right)^{-1}}, \tag{10}$$

where *Ψ*_{n} denotes the unknown field values at all *n* grid points in the secondary neighborhood around the actual main station, and *Ψ*_{s} are the observed values in the same subdomain. The distances between the positions of points *n* and *s* are abbreviated by *d*_{n,s}; *α* and *β* are parameters that control the impacts of more distant observations and the degree of smoothing, respectively. This interpolation is carried out for all local grid points.

With the interpolated values, the second derivatives are discretized by centered finite differences. For example, the second derivative in the *y* direction evaluated for an arbitrary station *s* can be expressed as

$$\left(\frac{\partial^{2}\Psi}{\partial y^{2}}\right)_{s} \approx \frac{\Psi(x_{s},\,y_{s}+\Delta y) - 2\,\Psi(x_{s},\,y_{s}) + \Psi(x_{s},\,y_{s}-\Delta y)}{(\Delta y)^{2}}, \tag{11}$$

where (*x*_{s}, *y*_{s}) denotes the coordinates of the station *s* and Δ*y* is the distance between two adjacent local grid points in the *y* direction.

The derivative of the curvature with respect to a station value is likewise approximated by a finite difference with respect to the increment Δ*Ψ*_{p}:

$$\frac{\partial\left(\nabla^{2}\Psi\right)_{n}}{\partial\Psi_{p}} \approx \frac{\left(\nabla^{2}\Psi\right)_{n}\!\left(\Psi_{p}+\Delta\Psi_{p}\right) - \left(\nabla^{2}\Psi\right)_{n}\!\left(\Psi_{p}\right)}{\Delta\Psi_{p}}. \tag{12}$$

Both terms on the right-hand side of Eq. (12) differ by the increment Δ*Ψ*_{p} that is added to the station value *Ψ*_{p} in the argument list of the first term. For practical execution, this increment is added to the station value during the interpolation [Eq. (10)] to the grid points that are needed to compute the curvatures.

_{p}By applying these concepts to the observations *Ψ _{o}* and inserting these discretized derivations into Eqs. (8) and (9), it is possible to solve Eq. (7) by matrix inversion.

### f. Distinguishing different deviations

The solution of Eq. (7) yields the deviations Δ*Ψ* = *Ψ*_{a} − *Ψ*_{o}. At this point the decision has to be made if an observation is accepted, corrected by the deviation, or dismissed. This decision is made for one station after another and depends not only on the value of the deviation itself, but also on its impact in correcting the observation. This impact can be expressed as the degree of the reduction of the cost function if the deviation were applied to the observed value. The reduction of the cost function is evaluated separately for every station.

Considering the idealized example of a domain with only one erroneous station (see Fig. 4a), significant deviations are calculated not only for the erroneous station, but also for its neighbors (Fig. 4b). By computing the cost function reductions and using them as weighting factors, the deviations Δ*Ψ* are scaled according to their actual contribution to smoothing the field. As shown in Fig. 4c, this modification of the deviations avoids an error propagation to stations that are considered to be error free.

The weighted deviations lead to one of the following decisions:

- Gross error—The observation is assumed to have a gross error if the cost function reduction exceeds a user-defined threshold and if the weighted deviation exceeds, at the same time, a user-defined multiple of the median of all the weighted deviations. The latter condition avoids the case where very small deviations, although reducing the local curvature significantly, are identified as outliers. Stations with gross errors are excluded from further considerations by this decision-making algorithm.
- No gross error—If only one of the above-mentioned two criteria is not fulfilled, the observation is retained and the following two cases for handling the deviations are possible.

- The weighted deviation is applied—If the weighted deviation exceeds a user-defined absolute threshold, it is applied.
- The weighted deviation is not applied—In the opposite case when the threshold is not reached, the error is regarded as being randomly distributed and the observation is accepted without corrections.

The mentioned thresholds are set by experience and may be changed depending on user-defined requirements similar to the choice of a numerical filter technique.
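The decision logic described above can be sketched as follows; all threshold values here are illustrative, not the operational ones:

```python
import statistics

def decide(deviation, cost_reduction, all_weighted_devs,
           reduction_threshold=0.5, median_factor=5.0, apply_threshold=0.2):
    """Decision making for one station: gross error if both the cost
    function reduction and the deviation (relative to the median of all
    weighted deviations) are large; otherwise apply or ignore the deviation."""
    med = statistics.median(abs(d) for d in all_weighted_devs)
    if cost_reduction > reduction_threshold and abs(deviation) > median_factor * med:
        return "gross error"       # exclude the station and repeat the QC
    if abs(deviation) > apply_threshold:
        return "apply deviation"   # correct the observation
    return "accept"                # error regarded as random; no correction
```

The median-based condition implements the safeguard mentioned above: a tiny deviation that happens to reduce the local curvature strongly is still not declared an outlier.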

As soon as a gross error is detected, the whole QC procedure is repeated after discarding the gross-error-affected observations. Otherwise, the former neighbors of outliers would maintain their misleadingly large deviations. The artificially induced error in the example shown in Fig. 4 is identified as a gross error and therefore has no further influence on the computation of the analyzed field.

It should be mentioned that the station density and the station distribution, as well as the chosen interpolation method, have an effect on the interpolated absolute value at the grid points around each station (cf. Fig. 3) and, as a consequence, also on the curvature. Nevertheless, the VERA-QC concept depends only on the relative change in curvature reduction, which is not very sensitive to the chosen interpolation method.

### g. Consideration of clustered stations

In general, the weighting with the cost function reduction offers a good method for identifying erroneous stations. Still, there are some special constellations of station alignments where an error from one station is propagated to another close-by station. This problem occurs if the distance between two or more stations is much smaller than the average distance between all stations in the considered subdomain. In this case, the curvature is not only minimized by correcting the erroneous station partially, but also by adding deviations of opposite sign and comparable magnitude to the neighboring station(s). Although mathematically comprehensible, this procedure does not lead to the desired result.

With the help of an idealized example, supported by Figs. 5 and 6, as well as Table 1, the problem of so-called clustered stations and its solution are described.

Table 1. Observation values (obs), deviations, and final results of the QC procedure with (row 8) and without (row 2) cluster treatment for an idealized example with 20 regularly distributed stations, as shown in Fig. 5. See text for further explanation.

If the distance between neighboring stations falls below a certain percentage of the median of all station distances in the considered subdomain, these affected stations are combined to one fictive cluster station. In Fig. 5, stations 1a and 1b are recognized to be clustered stations and, as a consequence, are combined to the fictive cluster station 1. Note that for better visibility the displayed distance between these two stations has been increased.

Suppose a flat observation field in which all observations exhibit the constant value zero except for station 1b, which is affected by an artificial error of magnitude 1 (see Table 1, row 1). This observation field is presented in Fig. 6a. Applying the QC procedure without the special cluster treatment would lead to opposite deviations of comparable absolute value for stations 1a and 1b (Table 1, row 2). As one can see in Fig. 6b, this special constellation would reduce the error by only approximately 50% while adding a significant virtual error to the neighboring station 1a, which was assumed to be error free.

To handle this problem the following steps are carried out:

- After identifying the cluster members, these stations are replaced by a virtual station whose coordinates are computed as the mean of the original stations’ coordinates. The value of the fictive cluster station is derived as a weighted mean of the individual cluster member values, with the inverse of each member’s number of primary natural neighbors serving as the weighting factor. The modified observation field is shown in Fig. 6c and Table 1 (row 3).
- The VERA-QC procedure is applied to the new station distribution and as a result the weighted deviations are computed (Table 1, row 4).
- These weighted deviations for the cluster stations are transferred to the member stations (Table 1, row 5) and are applied to their observation values.
- In the next step, the QC procedure is repeated for all stations with the former cluster members, modified as described in the previous point. The resulting deviations are listed in Table 1 (row 6).
- Finally, the weighted deviations of both QC iterations are accumulated. All uninvolved stations receive the weighted deviations from the second iteration (Table 1, row 7). Note that a weighted deviation is only applied if the considered station value is not detected as a gross error and the threshold described in section 3f is exceeded. Comparing the results of the QC process with and without clustering (Figs. 6b and 6d, respectively; Table 1, rows 2 and 8), one can see the positive effects of cluster treatment.

Metaphorically speaking, the weighting method described in the first point has the consequence that errors are propagated to cluster members that are better embedded in the station network. The higher number of primary neighbors enables an error to be detected and reduced more easily.
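Assuming the number of primary natural neighbors of each member is already known (e.g., from a Delaunay triangulation of the full station set, which is outside the scope of this sketch), the construction of the fictive cluster station can be sketched as:

```python
def merge_cluster(coords, values, n_neighbors):
    """Combine cluster members into one fictive cluster station.

    coords: list of (x, y) member coordinates
    values: list of member observation values
    n_neighbors: list of each member's number of primary natural neighbors
        (assumed to be supplied by the caller)
    Returns (fictive_coords, fictive_value).
    """
    # Fictive position: plain mean of the member coordinates.
    cx = sum(x for x, _ in coords) / len(coords)
    cy = sum(y for _, y in coords) / len(coords)
    # Fictive value: mean weighted by the inverse neighbor count, so that
    # members with fewer primary neighbors contribute more strongly.
    w = [1.0 / n for n in n_neighbors]
    value = sum(wi * vi for wi, vi in zip(w, values)) / sum(w)
    return (cx, cy), value
```

For two members at (0, 0) and (2, 0) with values 0 and 1 and neighbor counts 2 and 4, the fictive station lies at (1, 0) with value 1/3, i.e., the better-embedded member is down-weighted in the combined value.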

## 4. Examples

On the basis of some selected analytical examples, the properties and specific features of the described VERA-QC are presented in this section. Starting with the simplest case of a one-dimensional station distribution, we first consider observations with one central outlier, then three central stations with equal values differing from the rest, and finally an example with two separated outliers. The last example is compared to analogs with two- and three-dimensional station distributions, likewise affected by two errors, in order to show the positive effect of the increased number of neighbors in a higher-dimensional observation network. The section concludes with a rather complex example featuring gross and random errors as well as clustered stations. The application of this QC procedure to realistic observation fields is beyond the scope of this paper; we plan to present it in a following paper with a focus on operational applications and case studies.

### a. One dimension with one central outlier

Referring to the two-dimensional example visualized in Fig. 4 with one central outlier, Fig. 7 presents the one-dimensional equivalent. All values of the 15 stations (black dots) are equal except for the centered station value, to which an artificial error of magnitude 1 has been added. Apart from the original station values, Fig. 7 also shows the two results consisting of the observations corrected by the unweighted (light gray diamonds) and weighted (dark gray squares) deviations. In comparison to the two-dimensional example shown in Fig. 4, there are some similarities but also a significant difference. The application of the unweighted deviations leads, in both cases, to counter-swinging in the surroundings of the erroneous station, which can be avoided by weighting the deviations with the reduction of the cost function. Whereas in one dimension a station can have at most two nearest neighbors, in higher dimensions this number generally increases. As a consequence, an error can be detected and corrected more easily, and the influence on the surrounding stations is reduced considerably. Analyzing the difference between these two case studies, one can see that in the two-dimensional example the remaining error is reduced to approximately a third of that in the one-dimensional case.

### b. One dimension with centered signal

That the VERA-QC method is not just a smoothing algorithm can be seen from the example presented in Fig. 8. Contrary to the previous example, the values of the three centered stations lie beyond the range of the others. These values should be interpreted as a signal rather than as a group of outliers: on the one hand, it is unlikely that three neighboring stations are all affected by gross errors; on the other hand, this QC method is based on the assumption that gross errors are rare, as pointed out in section 2a. Applying the unweighted deviations would lead to a more or less smoothed analysis field in which the signal values are reduced by approximately 50% and all other station values are affected as well. This undesired averaging of the signal can be avoided by applying the weighted deviations. As a result, the magnitude of the signal is maintained, and only the sharp contrast between the three central stations and their surroundings is softened slightly. Moreover, one can see that the mean value is not preserved, which is a further special property of VERA-QC.

### c. One, two, and three dimensions, each with two errors

The effect that errors in higher dimensions are detected more easily and therefore corrected to a higher degree is illustrated by analyzing three comparable station distributions in one, two, and three dimensions. Moreover, it can be observed that the troublesome effect of counter-swinging in the surroundings of outliers is reduced by increasing the number of spatial dimensions and/or including the temporal dimension. With the help of three symmetrical station distributions differing in the number of dimensions (arranged in Figs. 9a–c from top to bottom on the left-hand side), these two effects are visualized by comparing the corrected observations on the right-hand side of the same figure. The three examples have in common that in each case a centered station (black dot) surrounded by two layers of stations (inner layer, dark gray dots; outer layer, light gray dots) exists. Two artificial errors (black pentagrams) of magnitude 1 are imposed on one station of the first layer and on one of the second layer. The corresponding bar plots on the right-hand side present the observations (white bars with black edges), as well as the corrected observations based on the unweighted and weighted deviations (light and dark gray bars). Apart from the two above-mentioned effects (note that both errors and counter-swinging are reduced approximately by half with each added dimension), the influence of the erroneous station’s position with respect to the boundaries of the domain is illustrated. This can be seen by comparing the bars corresponding to erroneous stations (observation value 1). The left example in each panel shows a station located in the inner layer, displaying a higher detection and correction rate for the errors, whereas the right bars show those of the outer layer with accordingly lower corrections. The reason for this unequal efficiency lies in the different numbers of natural neighbors, which are naturally lower at the boundary of a domain.

### d. Real two-dimensional station distribution with artificial observations

A real mesoscale station distribution (black dots) in the area surrounding Vienna, Austria (white lines represent borders), with some clustered stations (white squares) and one gross error (pentagram), is shown in Fig. 10a. The observations represent an artificial mean sea level pressure field composed of a southwest-to-northeast gradient; it could equally represent a potential or equivalent potential temperature field. Figure 10b shows the interpolation of these station values onto a regular grid, carried out with VERA, a high-resolution analysis scheme based on the thin plate spline method. To simulate realistic observations, Gaussian random errors with a variance of 1 hPa and one gross error were added to the reference field. Figure 10c illustrates the pressure field with random errors and Fig. 10d the simulated observation field (with the additional gross error) to which the VERA-QC was applied. The given analytic pressure field is thus disturbed not only by random errors and by a negative gross error in Vienna, but also by another outlier in the southwest of the domain that is part of a cluster. One may ask why the gross error in Vienna is not part of a cluster even though the distances between the adjacent stations are quite short; as mentioned in section 3g, cluster recognition depends on the station distribution in the local subdomain. To correct the observations, the steps of the VERA-QC process described in detail in section 3 are carried out. After the first iteration, with cluster treatment and gross error recognition, the gross error stations are rejected; in the second iteration the weighted deviations are computed and applied to the observed values. The resulting field can be seen in Fig. 10e, where in the greater area of Vienna the defined analytic gradient in the pressure field is restored and is only influenced by the random fluctuations. Moreover, the outlier of the mentioned cluster station is largely corrected without rejecting the affected station. This result can also be seen in Fig. 10f, where the field of the weighted deviations (the difference between the fields shown in Figs. 10d and 10e) is presented. Except for the two error-affected stations, all other stations require only minor corrections, whose application depends on the user-defined thresholds described in section 3f. This complex example demonstrates the advantage of VERA-QC of not smoothing outliers and their surroundings but rather maintaining the resolvable patterns of the observation field.

It should be mentioned that VERA-QC is written in Matlab and is able to run on a Linux server as well as on a Windows PC. The QC procedure (including data reading and writing) of one parameter for a domain containing approximately 1000 stations takes about 10 s. Therefore, it is an appropriate preprocessing tool for every kind of further analysis.

## 5. VERA-QC in comparison with two common QC methods

In section 2, an attempt was made to classify the VERA-QC procedure among the different fundamental approaches of widely used QC methods. In the following, the performance of VERA-QC is compared to that of two commonly used spatial consistency checks, based on inverse distance (ID) and spatial regression (SR) interpolation. This is done by applying the QC processes to a series of artificial observation fields with seeded errors that were designed to be as realistic as possible. In contrast to real fields, artificial observations have the advantage that the “truth,” as well as the random and gross errors, is known.

### a. Observation field

To simulate an observation field presenting all possible realistic difficulties, such as different station densities, alpine and coastal influences, and the lack of smoothness, a domain including parts of the Mediterranean Sea and most of the Alps was selected. The positions of 332 stations were taken from the World Meteorological Organization (WMO) Global Telecommunications System (GTS) station list and divided into three types based on their location, namely whether they were coastal, lowland, or alpine stations.

As a meteorological parameter, the mean sea level pressure was chosen. It is simulated by a composition of three two-dimensional wave patterns [wave lengths *λ* = (400, 600, 800) km, amplitudes varying randomly between

The values of the observation field are interpolated to the station positions and Gaussian-distributed random errors (mean *μ* = 0 hPa; standard deviation *σ* = ⅓ hPa) are added. Optionally, gross errors with a mean value of *μ* = 15 hPa, a standard deviation of *σ* = 2 hPa, and a random sign can be included.
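The error seeding can be sketched as follows. The function signature and the gross-error fraction used below are illustrative assumptions (the 2% fraction is taken from the gross error experiment in section 5c), while the error statistics follow the values given above:

```python
import random

def seed_errors(truth, sigma=1/3, gross_frac=0.0,
                gross_mu=15.0, gross_sigma=2.0, rng=None):
    """Turn a list of 'true' station values into simulated observations.

    Gaussian noise (mean 0, standard deviation 1/3 hPa) is added to every
    station; a fraction of stations additionally receives a gross error
    drawn with mean 15 hPa, standard deviation 2 hPa, and random sign.
    Returns (observations, gross_error_flags).
    """
    rng = rng or random.Random()
    obs, flags = [], []
    for v in truth:
        e = rng.gauss(0.0, sigma)
        gross = rng.random() < gross_frac
        if gross:
            e += rng.choice((-1, 1)) * rng.gauss(gross_mu, gross_sigma)
        obs.append(v + e)
        flags.append(gross)
    return obs, flags
```

Stations without a gross error then lie very close to the truth, while gross-error stations stand out by an order of magnitude, mimicking the separation exploited by the QC methods.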

### b. Methods

Since the spatial consistency checks using ID and SR are straightforward, these QC methods were chosen for the comparison. Their formulation is well known and can be found, for example, in Hubbard and You (2005), whose detailed formalism was followed when implementing the two methods. The required settings were optimized for the simulated observation fields and for the given station distribution:

- For both ID and SR, the influence radius *r* was set to *r* = 100 km in order to enable the allocation of neighbors in the less dense regions in the south of the domain.
- The minimally required coefficient of determination *R*^{2} used to select influencing stations for the SR was optimized to handle gross errors and was assigned a value of *R*^{2} = 0.5.
- Concerning SR, the recognition of gross errors also requires the definition of a confidence interval. The parameter *f*, a multiple of the weighted standard error of the estimate, controls the width of this interval and regulates the strictness regarding gross errors. An ideal value was found to be *f* = 8.4.
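A minimal sketch of such an inverse distance consistency check, in the spirit of Hubbard and You (2005): each station is estimated from its neighbors within the radius *r* and flagged when the residual exceeds a threshold. The flagging threshold and the returned data layout are assumptions for illustration:

```python
def id_check(stations, values, r=100.0, flag_threshold=5.0):
    """Inverse distance (ID) spatial consistency check (sketch).

    For each station, estimate its value as the inverse-distance weighted
    mean of all other stations within radius r (km) and flag it when
    |observation - estimate| exceeds flag_threshold (an assumed value).
    stations: list of (x, y) coordinates in km; values: observations.
    Returns a list of (estimate, flagged); stations without neighbors
    inside r are left unchecked (estimate = None).
    """
    results = []
    for i, (xi, yi) in enumerate(stations):
        wsum = vsum = 0.0
        for j, (xj, yj) in enumerate(stations):
            if j == i:
                continue
            d = ((xi - xj)**2 + (yi - yj)**2) ** 0.5
            if 0 < d <= r:
                w = 1.0 / d  # inverse distance weight
                wsum += w
                vsum += w * values[j]
        if wsum == 0.0:
            results.append((None, False))
        else:
            est = vsum / wsum
            results.append((est, abs(values[i] - est) > flag_threshold))
    return results
```

On a line of five stations with one large outlier, only the outlier is flagged; a station beyond the influence radius of all others is simply left unchecked, illustrating the neighbor-allocation problem the 100-km radius is meant to mitigate.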

In VERA-QC the minimal cost function reduction

### c. Results

The performance of the three different QC methods is evaluated regarding two criteria, namely the recognition of random errors and the detection of gross errors.

#### 1) Recognition of random errors

As a measure for comparing the performance of the three QC methods, the difference between the added artificial random errors and the deviations suggested by the QC procedures for a simulated observation field free of gross errors is used. First, 100 time steps were simulated in order to compute the correlations between the station values required by the QC option using SR interpolation. After that, another 100 simulations were carried out to collect the differences between the proposed deviations and the added random errors. The sorted and cumulated differences for the three methods are illustrated in Fig. 12. One can see that VERA-QC generally produces smaller differences between the (known) artificial random errors and the deviations (computed by the QCs) than the two other methods. Regarding this criterion, VERA-QC is preferable. The more sophisticated QC variant using SR interpolation delivers, as expected, somewhat better results than the one using ID interpolation.

Additionally, the statistical measures root-mean-square error (RMSE) and mean absolute error (MAE) are computed for the above-mentioned differences. In Table 2, their values are summarized for all stations together and also separately for the three station types (coastal, lowland, and alpine). The RMSE and MAE values confirm that VERA-QC generates the smallest errors and that a QC process using SR is superior to one using ID. The artificial observation field around alpine stations features the highest variability. Thus, one might expect a QC procedure to be less efficient in recognizing errors in this mountainous region. Nevertheless, the high station density in the alpine area compensates for the difficulty caused by the locally high variability. Naturally, the performance of a QC method using ID or SR degrades in coastal areas. This is due to the extremely inhomogeneous station distribution characteristic of these regions and to the constant radius of influence used when choosing neighboring stations. In contrast to QCs using ID or SR, VERA-QC adapts automatically to varying station densities and can treat coastal stations as well as those embedded in a more homogeneous station distribution. This aspect is reflected in the hardly varying values of the statistical measures for coastal, lowland, and alpine stations.

Table 2. Combined presentation of the RMSE and MAE for the three different QC methods. The given numerical values refer to all stations (all), as well as to station subsets (coastal, lowland, and alpine). Both statistical measures are based on the differences between added artificial random errors and suggested deviations. See the text for further explanations.
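Both measures follow their standard definitions, applied here to the differences between the seeded random errors and the QC-proposed deviations; a minimal sketch:

```python
def rmse(diffs):
    """Root-mean-square of the differences."""
    return (sum(d * d for d in diffs) / len(diffs)) ** 0.5

def mae(diffs):
    """Mean of the absolute differences."""
    return sum(abs(d) for d in diffs) / len(diffs)
```

Since the RMSE squares the differences before averaging, it penalizes the occasional large miss more strongly than the MAE, which is why both measures are reported side by side.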

#### 2) Detection of gross errors

The performance of a QC procedure can also be evaluated considering its ability to recognize gross errors. This ability is quantified with the help of contingency tables and skill scores. These evaluations are carried out for the VERA-QC and the QC methods based on SR because both offer an advanced criterion for the recognition of gross errors.

Figure 13a shows the layout of the contingency table used, which is taken from Wilks (1995) and Jolliffe and Stephenson (2003). The definitions of the skill scores used, the equitable threat score (ETS) and the Heidke skill score (HSS), can also be found in these books.

Compared to the high number of observations, the appearance of a gross error is a rare event. For this reason, the ETS, which is especially designed for the verification of such rare events, has been chosen as an evaluation parameter for the contingency table. Additionally, the values of the equally adequate but probably better known HSS have been computed.
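Both scores follow their standard definitions on a 2 × 2 contingency table (hits, false alarms, misses, correct negatives), as given, e.g., in Wilks (1995); a minimal sketch:

```python
def ets(hits, false_alarms, misses, corr_neg):
    """Equitable threat score from 2x2 contingency counts.

    Subtracts the number of hits expected by chance, so random
    flagging of rare events scores near zero.
    """
    n = hits + false_alarms + misses + corr_neg
    hits_random = (hits + false_alarms) * (hits + misses) / n
    return (hits - hits_random) / (hits + false_alarms + misses - hits_random)

def hss(hits, false_alarms, misses, corr_neg):
    """Heidke skill score from 2x2 contingency counts."""
    num = 2 * (hits * corr_neg - false_alarms * misses)
    den = ((hits + misses) * (misses + corr_neg)
           + (hits + false_alarms) * (false_alarms + corr_neg))
    return num / den
```

A perfect detection (e.g., 2 hits, 98 correct negatives, no misses or false alarms) yields 1.0 for both scores, which is the regime the results in Figs. 13b and 13c approach.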

As before, a time series of 100 simulated observation fields was generated but with the difference that 2% of all stations were affected by gross errors. The results of these simulations are summarized in Figs. 13b and 13c. As the achieved values of the statistical measures ETS and HSS are close to one, the performance levels of both QC methods are found to be very convincing. Note that the number of observations identified as false alarms by VERA-QC is slightly smaller. This implies that fewer measurements are rejected by mistake and the information offered by these stations stays available, which is especially important in regions with a less dense station distribution.

In contrast to a QC process based on SR, VERA-QC does not require any a priori knowledge, such as correlations between the observations. From our point of view this presents a considerable advantage.

## 6. Conclusions and outlook

In this article a new QC method based on self-consistency, called VERA-QC, has been outlined and compared to common QC procedures. As demonstrated, VERA-QC combines several advantages: it is model independent and checks the spatial and temporal consistency at the same time. Moreover, it is applicable to large domains, first by using the computationally inexpensive concept of sparse matrices and, second, because a large and unknown number of iterations, required by some other QC methods, is replaced by at most one additional repetition of the QC algorithm. VERA-QC also adapts automatically to different densities of observation networks by using the concept of natural neighborhoods and by considering physically implied covariances. Therefore, it is appropriate for controlling data acquired by field studies covering microscale phenomena, as well as GTS data from a more coarsely resolved observation network covering a whole continent. Compared to two other QC schemes, VERA-QC has shown a higher degree of efficiency in detecting erroneous values.

Although the presented VERA-QC method has been performing well as a preprocessing tool of the hourly VERA analyses, some improvements are planned. In the real-time application of VERA-QC, the unweighted and weighted deviations are stored, which offers the possibility of evaluating them statistically. VERA-QC is intended to check the representativeness of stations and to detect biased stations; as a logical next step, a bias correction of real data can be introduced. Another possible improvement is to extend the method toward a multivariate approach in which, for example, wind and pressure gradients could be treated simultaneously. This would affect the mathematical core of the VERA-QC procedure, and the cost function would have to be reformulated to include physical constraints.

Complementing the basic principles of the VERA-QC approach presented here, a further publication is in preparation. It will discuss the results of the operational implementation and the outcomes of applications in field studies.

## Acknowledgments

Thanks are due to the Austrian Science Fund (Fonds zur Förderung der wissenschaftlichen Forschung, FWF; P19658) and to the Austrian Research Funding Association (Die Österreichische Forschungsförderungsgesellschaft, FFG; project 818110) for partial financial support of this work.

## REFERENCES

Andersson, E., and H. Järvinen, 1999: Variational quality control. *Quart. J. Roy. Meteor. Soc.*, **125**, 697–722.

Baker, N. L., 1992: Quality control for the navy operational atmospheric database. *Wea. Forecasting*, **7**, 250–261.

Barber, C. B., D. P. Dobkin, and H. Huhdanpaa, 1996: The Quickhull algorithm for convex hulls. *ACM Trans. Math. Software*, **22**, 469–483.

Barnes, S. L., 1964: A technique for maximizing details in numerical weather map analysis. *J. Appl. Meteor.*, **3**, 396–409.

Feng, S., Q. Hu, and W. Qian, 2004: Quality control of daily meteorological data in China, 1951–2000: A new dataset. *Int. J. Climatol.*, **24**, 853–870.

Fiebrich, C. A., and K. C. Crawford, 2001: The impact of unique meteorological phenomena detected by the Oklahoma mesonet and ARS micronet on automated quality control. *Bull. Amer. Meteor. Soc.*, **82**, 2173–2187.

Gandin, L. S., 1988: Complex quality control of meteorological observations. *Mon. Wea. Rev.*, **116**, 1137–1156.

Hubbard, K. G., and J. You, 2005: Sensitivity analysis of quality assurance using the spatial regression approach—A case study of the maximum/minimum air temperature. *J. Atmos. Oceanic Technol.*, **22**, 1520–1530.

Ingleby, N. B., and A. C. Lorenc, 1993: Bayesian quality control using multivariate normal distributions. *Quart. J. Roy. Meteor. Soc.*, **119**, 1195–1225.

Jolliffe, I., and D. Stephenson, 2003: *Forecast Verification: A Practitioner’s Guide in Atmospheric Science*. John Wiley and Sons, 240 pp.

Lorenc, A. C., 1981: A global three-dimensional multivariate statistical interpolation scheme. *Mon. Wea. Rev.*, **109**, 701–721.

Lorenc, A. C., and O. Hammon, 1988: Objective quality control of observations using Bayesian methods: Theory, and a practical implementation. *Quart. J. Roy. Meteor. Soc.*, **114**, 515–543.

Pöttschacher, W., R. Steinacker, and M. Dorninger, 1996: VERA - a high resolution analysis scheme for the atmosphere over complex terrain. *MAP Newsletter*, Vol. 5, Mesoscale Alpine Programme Office, Zurich, Switzerland, 64–65.

Reek, T., S. R. Doty, and T. W. Owen, 1992: A deterministic approach to the validation of historical daily temperature and precipitation data from the cooperative network. *Bull. Amer. Meteor. Soc.*, **73**, 753–762.

Shafer, M. A., C. A. Fiebrich, D. S. Arndt, S. E. Fredrickson, and T. W. Hughes, 2000: Quality assurance procedures in the Oklahoma mesonetwork. *J. Atmos. Oceanic Technol.*, **17**, 474–494.

Steinacker, R., C. Häberli, and W. Pöttschacher, 2000: A transparent method for the analysis and quality evaluation of irregularly distributed and noisy observational data. *Mon. Wea. Rev.*, **128**, 2303–2316.

Steinacker, R., and Coauthors, 2006: A mesoscale data analysis and downscaling method over complex terrain. *Mon. Wea. Rev.*, **134**, 2758–2771.

Wade, C. G., 1987: A quality control program for surface mesometeorological data. *J. Atmos. Oceanic Technol.*, **4**, 435–453.

Wilks, D., 1995: *Statistical Methods in the Atmospheric Sciences*. Academic Press, 467 pp.

WMO, 2008: *Guide to Meteorological Instruments and Methods of Observation*. 7th ed. WMO-8, World Meteorological Organization, Geneva, Switzerland, 681 pp.