Advancing Near-Real-Time Quality Controls of Meteorological Observations


APRIL 2022

Meteorological data from ground weather stations represent an essential source of information for modeling, monitoring, and research studies. These data are assimilated every day for numerical weather predictions (e.g., Reich and Stuart 2015; Navon 2009) and are used in monitoring systems such as the European Drought Observatory, the European Flood Awareness System, and the European crop yield monitoring and forecasting system (MARS). Meteorological observations are key to understanding and characterizing past changes (e.g., van den Besselaar et al. 2013; Toreti and Desiato 2008), to reconstructing the climate of the last centuries (e.g., Luterbacher et al. 2016), and to evaluating and supporting climate model simulations (e.g., Toreti and Naveau 2015). However, the use of meteorological data heavily relies on their quality. Erroneous values can lead to inaccurate analyses, assessments, and evaluations. The effects of erroneous values can propagate and be amplified when meteorological data are interpolated on regular grids and used to feed models and monitoring systems. Detecting, and whenever possible correcting, these values is not an easy task and requires procedures with different levels of complexity (Durre et al. 2010; Fiebrich et al. 2010). Here, we describe a new near-real-time quality control system developed and implemented as an open-source R-based software package: QuackMe (Quality Checks of Meteorological Data). The system was initially developed to obtain high-quality daily meteorological information for the MARS crop yield forecasting system and the European Drought Observatory of the European Commission. It is freely available and designed to be widely usable by the scientific community and by practitioners. The development of QuackMe started in late 2018 to address the need to replace an old FORTRAN-based system (in place since the 1990s) with a new one that is dynamic, flexible, and more advanced in its controls. The beta version was completed at the end of 2019, after which an intense testing phase took place. Finally, QuackMe entered into force in 2020.
Before describing the details of the new quality control system, the next section introduces the design of QuackMe to ease understanding of the overall process and of how the checks are applied.

Design
QuackMe is not only a quality control system, but it is also designed to derive daily variables from subdaily meteorological observations coming from ground weather stations, as well as to derive additional meteorological indicators. QuackMe has a simple and intuitive structure based on several input and intermediate data (Fig. 1). Its engine has four main components: converter, check modules, aggregator, and offline threshold calculator. Besides these components, QuackMe is equipped with visualization and validation modules (Fig. 1).
The converter performs preliminary data-formatting tasks, and it has been developed to enhance the flexibility of QuackMe. Meteorological networks may use different data formats and codes, or these may change over time. Thus, QuackMe has its own internal data format, and the converter takes care of reading input data files and converting them according to this format. In this way, all the other components of QuackMe are format independent. Currently, the converter handles a few common data formats (e.g., decoded BUFR); therefore, it must be modified by users when other data formats need to be processed. This step is relatively easy and only requires basic R knowledge. The converter also produces the final output files with quality-controlled daily meteorological parameters, currently in a simple text format designed for use within the European MARS crop monitoring and forecasting system.

AFFILIATIONS: Toreti, Zampieri, and Ceglar-European Commission, Joint Research Centre, Ispra, Italy; Bratu-Fincons Group, Vimercate, Italy; Müller-DTN and MeteoIQ, Berlin, Germany; van der Grijn and Oliveira-DTN, Utrecht, Netherlands
The aggregator derives daily meteorological parameters from subdaily observations. It has specific aggregation rules for each parameter and zone of the world, although at the moment only Europe and China are implemented. To ensure flexibility, these rules are not encoded in the software but are written in dedicated XML files that can be easily modified by users. Currently, the aggregation rules try to maximize the availability of data, but different approaches can be specified to enhance temporal coherency and homogeneity. For instance, to derive the daily maximum temperature, QuackMe currently looks for the specific field usually available in the meteorological station records and, if it is missing, uses hourly values.
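As a purely hypothetical illustration of such a rule file (element and attribute names are invented here and do not reproduce the actual QuackMe XML schema, which is documented in the software manual), a daily-maximum-temperature rule might look like:

```
<!-- Hypothetical rule: daily maximum temperature for the Europe zone.
     Prefer the reported daily-maximum field; fall back to hourly values. -->
<aggregation zone="Europe" parameter="TMAX">
  <source priority="1" field="reported_daily_maximum"/>
  <source priority="2" method="max" field="hourly_temperature"
          minimumRecords="18"/>
</aggregation>
```

Keeping such rules in editable XML, rather than in the code, is what allows users to trade data availability against temporal coherency without touching the software.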
The check modules are the core of the software, as they perform all data quality controls. The available checks are grouped into three categories: weak, daily, and threshold based (for more details, see the next section). The principles behind these controls have been extensively discussed in Fiebrich et al. (2010, and references therein). The weak checks are executed immediately after the converter, while the daily and threshold-based checks run after the aggregator. To make QuackMe highly customizable, all controls can be easily modified thanks to the user-friendly R-based implementation. Error messages can be customized in dedicated XML files, and parameter-based groups of checks can also be deactivated in those files.
The values used in the threshold-based controls are derived by the offline threshold calculator. This component works in offline mode to make the operational use of QuackMe faster. It derives threshold values by inferring distributional properties of the main parameters from historically collected data. More details are provided in the following section.
The observational data flagged as suspicious or wrong by the check modules can be visualized in an interactive front-end and evaluated by meteorologists. All the affected stations, the observational dates, and the meteorological data are listed together with the associated error messages (Fig. 1). In addition, the affected stations are shown on an interactive map. Meteorologists can accept the flags proposed by QuackMe, modify the associated values, or even remove the flags. The front-end has been built with the Angular framework and interacts, via a Java back-end, with the XML files processed by QuackMe. It can also interactively activate the sequence of the individual QuackMe modules.
Finally, QuackMe offers the possibility of integrating additional sources of data, to be used as an independent reference and/or to fill gaps in the subdaily observations under analysis.

Quality checks
The first quality check module runs weak controls on the subdaily (either hourly or 3-hourly) meteorological observations coming from the converter (Table 1). These controls aim at detecting obviously erroneous values that must be removed without the need for further analysis. Except for a few cases, values detected at this step are flagged as wrong and do not pass to the next step. Weak checks are mainly interval-based controls, i.e., they check whether a value lies in an interval defined by reasonable physical limits. For instance, relative humidity must lie in [0, 100], and 2-m air temperature (expressed in °C) must lie in [-80, 60]. Consistency controls are also performed. These are temporal checks on consecutive records and checks on dependent variables. Examples are the difference in atmospheric pressure between time t and time t + 1, coherence between radiation and sunshine duration, and coherence between precipitation accumulated over different time periods (e.g., 1-, 3-, 6-, and 12-hourly). The full list of these quality controls is available in the software manual. All these controls are in line with the ones described by Fiebrich et al. (2010).
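As an illustration of interval-based and consistency controls, the sketch below implements two such weak checks in Python (QuackMe itself is written in R; the variable names and the pressure step bound are invented for the example, while the humidity and temperature intervals are the ones quoted above):

```python
# Sketch of weak checks; the humidity and temperature limits follow the
# intervals quoted in the text, the pressure step bound is invented.
PHYSICAL_LIMITS = {
    "relative_humidity": (0.0, 100.0),    # percent
    "air_temperature_2m": (-80.0, 60.0),  # degrees Celsius
}

def interval_checks(record):
    """Flag values lying outside reasonable physical limits as wrong ('W')."""
    flags = {}
    for var, (lo, hi) in PHYSICAL_LIMITS.items():
        value = record.get(var)
        if value is not None and not (lo <= value <= hi):
            flags[var] = "W"
    return flags

def pressure_step_check(p_now, p_next, max_step_hpa=25.0):
    """Temporal consistency check on consecutive records: an atmospheric
    pressure jump above max_step_hpa between time t and t + 1 is suspicious."""
    return "S" if abs(p_next - p_now) > max_step_hpa else ""
```

In this sketch, a record reporting 120% relative humidity would be flagged as wrong, while a 33-hPa pressure jump between consecutive records would only be marked suspicious and passed on for further evaluation.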
Once the aggregator component has created the daily values (see Table 2), interval-based and consistency controls are performed again, with upper and lower bounds different from those used for subdaily values. Values lying outside the predefined intervals are then flagged as wrong. The aggregation step is customizable to address the specific needs of users who may be interested in regions of the world other than the ones already implemented and made available. Subsequently, advanced controls are run. These are threshold-based checks, with thresholds estimated from a historical archive (here MARSMet; Toreti et al. 2019) over a reference period and in offline mode, to make the operational near-real-time use of QuackMe faster. These thresholds are either daily or seasonal, according to the variable and the check to be performed. While the principle behind these controls is the same as in Fiebrich et al. (2010), their implementation is based on dynamic, adaptive techniques making use of different statistical tools and methods. In this way, the derived thresholds are location specific and, at the same time, reduce the number of correct values being flagged as suspicious.
The daily thresholds are percentile based, e.g., the 1st and 99th percentiles of daily mean temperature. They are estimated for each day of the year by using a 15-day window centered on the day of interest. Then, the upper (lower) envelope is estimated by selecting local maxima (minima) and smoothed by using a Hermite spline. This procedure is similar to the one used in the empirical mode decomposition (EMD; Huang et al. 1998). The identified values are then taken as daily thresholds. When an observation lies above (below) the threshold, a suspicious flag is assigned.
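The day-of-year percentile estimation can be sketched as follows (Python with NumPy for illustration, QuackMe being R based; the envelope selection and Hermite-spline smoothing steps are omitted here):

```python
import numpy as np

def daily_percentile_thresholds(values_by_doy, window=15, q_low=1.0, q_high=99.0):
    """Estimate lower/upper daily thresholds from an array of shape
    (n_years, 365): for each day of the year, pool all values falling in a
    15-day window centered on that day (wrapping over the year boundary)
    and take the 1st and 99th percentiles."""
    n_years, n_days = values_by_doy.shape
    half = window // 2
    lows = np.empty(n_days)
    highs = np.empty(n_days)
    for d in range(n_days):
        idx = np.arange(d - half, d + half + 1) % n_days  # circular window
        sample = values_by_doy[:, idx].ravel()
        lows[d] = np.percentile(sample, q_low)
        highs[d] = np.percentile(sample, q_high)
    return lows, highs
```

An observation above `highs[d]` (below `lows[d]`) for its day of year would then receive a suspicious flag.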
At the seasonal scale, daily precipitation and wind speed are controlled by using an extreme value theory (EVT) approach (Coles 2001): 10-yr return levels are estimated for each season, and all values above these thresholds are flagged as suspicious. The 10-yr return levels are estimated by using a peaks-over-threshold approach, i.e., by inferring with maximum likelihood a generalized Pareto distribution on the exceedances above a selected threshold (here the 95th percentile), which represents a trade-off between the need to have a large enough sample and the need to be in the tail of the full distribution.
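A peaks-over-threshold return-level estimate can be sketched as below (Python/NumPy; note that this sketch fits the generalized Pareto parameters by the method of moments to stay dependency free, whereas QuackMe uses maximum likelihood, and the number of observations per season is an assumed value):

```python
import numpy as np

def seasonal_return_level(x, return_period_years=10, obs_per_year=90, q=0.95):
    """Estimate the T-year return level with a peaks-over-threshold approach:
    take exceedances above the 95th percentile, fit a generalized Pareto
    distribution (here by the method of moments), and invert it."""
    u = np.quantile(x, q)
    exc = x[x > u] - u
    m, v = exc.mean(), exc.var(ddof=1)
    shape = 0.5 * (1.0 - m * m / v)        # GPD method-of-moments estimates
    scale = 0.5 * m * (m * m / v + 1.0)
    zeta = exc.size / x.size               # probability of exceeding u
    n = return_period_years * obs_per_year # return period in observations
    if abs(shape) < 1e-9:                  # exponential-tail limit
        return u + scale * np.log(n * zeta)
    return u + (scale / shape) * ((n * zeta) ** shape - 1.0)
```

Seasonal precipitation or wind speed values above the estimated level would be flagged as suspicious.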
For those stations without long enough archived data, the thresholds for the percentile-based temperature checks are estimated by building a spatial model using kriging with external drift (Wackernagel 2003), with the primary variable being the one under analysis (e.g., the 99th percentile of daily maximum temperature) and the auxiliary variable being the same quantity obtained from the latest ECMWF reanalysis, ERA5 (Hersbach et al. 2020).
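The idea can be illustrated with a deliberately simplified stand-in for kriging with external drift (Python/NumPy; a least-squares drift on the ERA5 covariate plus inverse-distance weighting of the residuals, instead of the full geostatistical model used by QuackMe):

```python
import numpy as np

def drift_interpolate(xy_obs, z_obs, era5_obs, xy_new, era5_new, power=2.0):
    """Estimate a percentile threshold at an ungauged location: fit the
    linear drift z ~ a + b * era5 at stations with long records, then
    interpolate the drift residuals by inverse-distance weighting
    (a crude substitute for the kriging step)."""
    A = np.column_stack([np.ones_like(era5_obs), era5_obs])
    coef, *_ = np.linalg.lstsq(A, z_obs, rcond=None)  # drift coefficients
    resid = z_obs - A @ coef
    d = np.linalg.norm(np.asarray(xy_obs) - np.asarray(xy_new), axis=1)
    w = 1.0 / np.maximum(d, 1e-9) ** power
    w /= w.sum()
    return coef[0] + coef[1] * era5_new + w @ resid
```

The key design point carries over to the real method: the reanalysis covariate supplies the large-scale spatial pattern, while the station data anchor the local values.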
A large-scale extreme event may result in many stations' records being flagged as suspicious. To deal with these cases, and considering that meteorologists can check only a limited number of suspicious values in near-real-time operational mode, a dedicated method to detect events induced by large-scale conditions has been developed. The implemented approach starts by clustering all stations flagged as suspicious for the same parameter. An EM-clustering method with random initialization is applied (Dempster et al. 1977; Maitra 2009) together with the gap statistic (Tibshirani et al. 2001) to identify the optimal number of clusters. Then, for each cluster, an α-convex hull is estimated (Edelsbrunner et al. 1983; Rodríguez Casal 2007; Pateiro-López and Rodríguez-Casal 2010). The α convexity generalizes the convexity condition of the set to be estimated based on the sample points. Here, these points are simply the locations of the stations reporting the same flag for the same check. In brief, the method looks for the smallest α-convex set containing all the sample points. Once the α-convex set is estimated, its area is derived. If the area is greater than a chosen threshold, then all suspicious flags of the stations located within the area are removed and the values are considered to be valid. The threshold is currently set at 71 × 10³ km², but it is editable in the associated XML file. After all these controls, values still flagged as suspicious can be checked and evaluated by meteorologists by using the visualization component of QuackMe. Wrong or untrustworthy values are deleted. Alternatively, corrected values can be entered in the front-end and then used for further processing and to rerun the check modules. A meteorologist can also classify values as trustworthy in the front-end.
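The area test at the end of this procedure can be sketched as follows (Python; for simplicity this uses an ordinary convex hull via Andrew's monotone chain rather than the α-convex hull, and it omits the EM-clustering step):

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hull_area(points):
    """Shoelace formula on the hull vertices."""
    h = convex_hull(points)
    return 0.5 * abs(sum(h[i][0] * h[(i + 1) % len(h)][1]
                         - h[(i + 1) % len(h)][0] * h[i][1]
                         for i in range(len(h))))

def is_large_scale_event(station_xy_km, area_threshold_km2=71e3):
    """If the hull of same-flag stations covers more than the threshold
    area, treat the flags as a genuine large-scale event and clear them."""
    return hull_area(station_xy_km) > area_threshold_km2
```

In the real system the α-convex hull follows the cluster shape more closely than a plain convex hull, which matters for elongated station clusters such as the ones along the Scandinavian coast discussed in the first case study.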
The data display and the various graphical filtering options support experts in deciding whether the value under analysis is actually incorrect or trustworthy. The interactive map helps to identify mesoscale and local effects such as foehn, wind-exposed locations, or orographic rainfall.
Via the back-end, the decisions made for all values are written in the QuackMe XML files. The subsequent processing step of QuackMe is then activated via the front-end, being either a rerun of the check modules or the start of the next module. To provide an overview of the flags a meteorologist deals with during the execution of QuackMe, Fig. 2 shows the histograms of the suspicious and wrong flags obtained in August and September 2021. Most days are characterized by a few flags, although the histograms point to right-skewed, long-tailed distributions. This implies that, although rarely, there may be days with a higher number of flags. At the end of the entire process, values have either no flag or a sequence of flags. The former means that the value passed all quality controls without detected issues in either the value itself or the subdaily data used to derive it. A sequence of flags can contain one or more of the following letters: S (suspicious), I (interpolated), and H (modified). The flag W (wrong) is not allowed in the final sequence, because all W-flagged values are removed and not permanently stored. The flag H is used to trace values modified by meteorologists via the front-end. The flag I is used to identify daily values that were built by using an incomplete set of subdaily data, with gaps filled by ad hoc procedures, e.g., based on additional adjusted sources of data.

Case studies
To show how QuackMe works, but also to highlight the intrinsic limits and uncertainties in checking meteorological data from ground weather stations, three case studies are discussed here. The first one focuses on an event that occurred on 10 February 2020. It exemplifies the importance of the α-convex hull approach in sparing meteorologists from having to deal with a large number of flagged data.
Many pressure observations for 10 February 2020 from stations located in the Scandinavian Peninsula and Iceland were flagged by QuackMe (Fig. 3) due to their suspiciously low values. These extremes were actually caused by Storm Ciara and the anomalous low pressure system over the region (Fig. 3). Before the values reached the visualization component, the procedure dealing with a large number of stations flagged for the same parameter was started. In the first step, four clusters were identified: three across Norway and Sweden and one for the stations in Iceland (Fig. 3). Then, for each cluster, the α-convex hull was estimated and its area derived. Finally, all suspicious flags for those stations (except for one in Iceland lying outside the convex hull) were removed, and meteorologists had to check just a few isolated cases.
The second case study refers to 26 February 2020, when three stations reported 0 daily global solar radiation while five others reported unrealistically high values between 36 and 86 MJ m⁻² (Fig. 4). As expected, QuackMe detected those values and flagged them as wrong. To help readers understand the case study, the global solar radiation for the same day given by ERA5 is also shown for the entire European domain in Fig. 4. The third case study deals with a more complex situation, where the meteorologist had to judge the identified suspicious values and take action. On 14 February 2021, the daily total precipitation derived from four ground weather stations exceeded the estimated 10-yr return level (Fig. 5) used as a threshold to flag suspicious values. Therefore, those values entered the visualization component and were evaluated by a meteorologist according to the synoptic conditions. Based on that analysis, all values were accepted and the flags removed. To help readers understand the synoptic context the meteorologist evaluated, Fig. 5 shows the mean sea level pressure of that day, as given by the near-real-time JMA reanalysis JRA-55 (Kobayashi et al. 2015). By looking at that map, it is not easy to understand what triggered these extreme precipitation events. An Atlantic ridge expanding through the Iberian Peninsula to the Gulf of Biscay is clearly visible, as well as the low-pressure meridional belt extending from northeastern France to the central Mediterranean. Other sources of information were surely used by the meteorologist to reach a decision, e.g., satellite and radar data as well as precipitation from the operational short-term ECMWF forecasting system.

Discussion and conclusions
Daily meteorological data are an essential source of information for scientific studies, climate services, and associated socioeconomic activities. Identifying, and whenever possible correcting, erroneous values is key to ensuring high-quality and reliable information and to avoiding cascading effects in the services and studies based on these data. QuackMe implements a new approach characterized by checks of increasing complexity. Besides the standard quality controls, complemented with many additional cross and consistency checks, the system applies advanced threshold-based controls. These checks rely on historical information to derive thresholds by applying specific statistical methods, e.g., based on extreme value theory. The lack of a historical archive is addressed by using reanalysis and geostatistics in the case of temperature-based thresholds. A dedicated approach for precipitation and wind thresholds is still under development. Recently proposed methods (e.g., the fast semiparametric approach of Naveau et al. 2014) may offer interesting opportunities, although there are issues in applying them to large domains. Among them, the most important is the often-made stationarity assumption on the parameter controlling the limit behavior of the distribution's tail (i.e., the shape parameter), which is clearly heavily violated over large areas. As shown by the case studies, expert-based evaluation is still crucial, especially when extreme events occur. This is a time-consuming activity that QuackMe supports with a dedicated automatic procedure that lets the meteorologist focus on cases that are truly doubtful. Recently launched AI initiatives (e.g., Boukabara et al. 2021) and projects on AI and climate services (e.g., the EU H2020 project CLINT) may bring innovative approaches to reduce the burden on (and the time needed by) meteorologists for near-real-time daily evaluation. Regular updates of the thresholds, based on distribution functions inferred from past data, are also important to keep the number of suspicious values to be checked stable. Such updates, to be done every 5-10 years, can counterbalance the tendency toward more frequent and more intense extremes induced by climate change.
QuackMe represents an effective, adaptive, and highly customizable tool that addresses the needs of weather services, practitioners, and scientists working with meteorological data. It is open source and freely available under the GPLv2 license. It can be retrieved from the dedicated GitHub page: https://github.com/ec-jrc/QUACKME.

Fig. 1. (top) The QuackMe architecture. The dashed arrows denote the iterative steps to be taken toward the final data and flags. The gray arrows denote the last steps of the QuackMe process. (bottom) The visualization and validation module of QuackMe.

Fig. 2. Histograms of all suspicious and wrong flags (green in the upper panel and blue in the lower panel, respectively) obtained during the execution of QuackMe in August and September 2021.

Fig. 3. First case study on the event that occurred on 10 Feb 2020. (a) Stations where flagged values were reported and the estimated α-convex hulls. Colors are associated with the identified clusters. (b) Daily mean sea level pressure anomalies derived from ERA5.

Fig. 4. Second case study on the global solar radiation errors detected on 26 Feb 2020. The three stations that reported 0 MJ m⁻² are shown in green, and the five stations with unrealistically high values are shown in pink. The filled contour map shows the daily global solar radiation as given by ERA5.