Abstract

Environmental measurements often show unsteady variability and background red noise (stochastic component) superimposed on persistent trivial daily, seasonal, and annual variability. In addition to this, environmental time series often present gaps due to a myriad of possible factors, such as malfunction of the sensors, connection loss, etc. As a result, interpreting and identifying periodicities in this type of time series by means of spectral analysis tools, like the Fourier transform, are difficult and lack precision. To overcome these difficulties, a methodology is proposed in the first part of this paper that integrates statistical tools (iterative Student’s t test), parametric reconstruction, and spectral analysis (Lomb periodogram and wavelets). In the second part of the paper, this methodology is tested (i) in the high-frequency part of the spectrum of two (well known) synthetic time series and (ii) to identify nontrivial (e.g., daily cycles) high-frequency periodicities (linked to some mesometeorological processes) in three tropospheric ozone time series recorded by the Valencia regional air quality monitoring network (on the Mediterranean side of Spain) during a 14-yr period. This methodology can determine statistically significant, seasonally dependent recurrences in the high-frequency variability (<15 days) observed in ozone time series measured in a Mediterranean region of Spain under high noise-to-signal ratios.

1. Introduction

Environmental time series frequently show features resulting from the physical, chemical, or biological processes that drive the recorded parameter (temperature evolution, oxidation rate, population growth rate, etc.). Determining the intensity or persistence in time of these features is one of the goals of time series analysis. Specifically, one can see this idea in studies related to finding cycles in paleoclimatology (Debret et al. 2007; Bond et al. 2001), searching for long-term persistence (Lanfredi et al. 2009; Matsoukas et al. 2000), and analyzing trends or seasonality (Millán et al. 2005).

In the case of environmental time series driven by climatic or meteorological forcing, the time variability of the parameter under study can be modulated not only by “trivial” meteorological cycles (i.e., daily, seasonal, and annual variations) but also by “not so trivial” signals superimposed on these cycles.

Nevertheless, climatic or meteorological time series frequently have strong temporal dependencies and show nonsteady variabilities throughout the years. In some cases, by decomposing time series into time–frequency space through a continuous wavelet transformation, it is possible to determine both the dominant modes of variability and how those modes vary in time (Lau and Weng 1995). The difficulty of this procedure arises from the fact that measurements usually have some low-powered, or weak, fluctuations that might be masked by more persistent components (usually due to the aforementioned trivial meteorological cycles).

According to Hasselmann’s (1976) approach to stochastic scale separation, the variability of climatic series can be interpreted in terms of characteristic time scales or well-differentiated time cycles, and thus the spectral bands in frequency space are identified as footprints of physical processes with different characteristic time scales. Therefore, low frequencies in climatic time series are identified with climatic subsystems that show a slow variation with time (oceans, ice cover, etc.). Furthermore, Orlanski (1975) proposed a classification of atmospheric processes in terms of their spatial dimension and duration. With this classification, it is possible to delimit the harmonic analysis to the relevant time scale when studying a particular atmospheric process.

In addition, climatic and meteorological time series often contain gaps arising from many causes, such as sensor malfunction, connection loss, or errors in the data acquisition system. For this reason, most of these time series should be regarded as unevenly sampled data. It may thus be necessary to use the Lomb–Scargle periodogram (Lomb 1976; Scargle 1982) to perform a correct analysis in frequency space. In this sense, Hocke and Kämpfer (2009) have described a new method for filling gaps in atmospheric datasets using the Lomb–Scargle periodogram.

Apart from gap filling, the main problem in harmonic analysis in frequency space is to establish which harmonics belong to a characteristic process and which are linked to the intrinsic noise of the time series. This intrinsic noise is generated by the stochastic climatic component of the natural environment, which adds a noisy background signal to the geophysical time series. When the power of this noise is high, it can mask cycles due to physical processes, decreasing the signal-to-noise ratio. Different approaches have been used to deal with this issue. Hasselmann (1976) demonstrated that a first-order autoregressive (AR1) process is sufficient to explain the climatic red-noise signature. Horne and Baliunas (1986) used the false-alarm probability to detect the presence and significance of a frequency in an unevenly sampled time series.

In this paper, we propose another method to analyze nonsteady principal harmonics in noisy meteorological time series with the final aim of obtaining some statistically significant periods (or frequencies) embedded in the dataset. We use a parametric reconstruction based on a finite sum of significant sinusoidal signals. To perform this parametric reconstruction we use the Lomb–Scargle periodogram to minimize the gap effect. We use the statistical properties of the parameter to determine how many sinusoidal waves have to be reconstructed. The inverse of the frequency parameter of each component will be a significant period that explains the original time series. We can identify these significant periods with periods due to external forcings once we have removed the stochastic component.

It is necessary to consider the mesoscale (Orlanski 1975), both spatially and temporally, across the whole study area to quantify and parameterize the dynamic patterns of the lower troposphere and their implications for climate and air quality (Salvador et al. 1999). The time scale in the mesoscale makes it very difficult to study the periodicity, statistical significance, and persistence (recurrence) of patterns in environmental time series because of the stochastic variability typically present in them. At present, only descriptive studies of air quality time series (daily and seasonal cycles, interannual trends, etc.) in different airsheds and/or quantitative mesoscale modeling analyses for short-term scenarios can be found in the scientific literature (Millán et al. 1996, 2000; Gangoiti et al. 2001; Moussiopoulos et al. 2005; Palau et al. 2005b, 2009). The weakness of these approaches lies in the difficulty of studying the high-frequency periodicities of a large number of long time series to justify the spatial and temporal representativity of the effect of mesoscale atmospheric processes on the time evolution of a given climatic, meteorological, or chemical parameter. The number of time series to be considered and their temporal length will depend on the spatial area to be analyzed and the signal-to-noise ratio.

In line with the aforementioned studies, in this paper we test a new methodology for carrying out statistically significant analyses of the impact of high-frequency weather cycles or periodicities (with low signal-to-noise ratio) on environmental parameters (such as air pollutants). Thus, we test the methodology with “real” environmental (noisy and nonsteady) “long” (14 yr) time series obtained from three stations in an automatic regional air quality monitoring network in Spain. The aim of this test is to see if the method proposed is capable of obtaining the statistical significance of the harmonics associated with the aforementioned high-frequency cycles (see further details in the next subsection).

To close this section we provide a brief description of the background of the environmental time series used herein to study the statistical significance of some of the “nontrivial” and seasonally and geographically dependent periodicities observed when plotting environmental time series recorded in the western Mediterranean basin (WMB). The second section of this paper describes our proposed method. This is followed by a section on how the method is applied, indicating how significant cycles can be extracted from a selected scale. In the case described, we first used a synthetic time series, generated in such a way that it was similar to a typical meteorological year dataset, and then we applied our method to three tropospheric ozone time series recorded over a 14-yr period at some stations in the Valencia region (Spain) automatic air quality monitoring network.

Background: Tropospheric ozone cycles in the Mediterranean basin

Tropospheric ozone variability is strongly dependent on meteorological fluctuations and anthropogenic influences. This is because ozone formation is related to temperature, solar radiation, NOx emissions, and volatile organic compound (VOC) concentrations (Sebald et al. 2000). Thus, since meteorology and anthropogenic emissions have cyclical patterns, it is reasonable to expect that the variability in the ambient ozone concentration will also have periodic behavior at different time scales. Cvitaš et al. (2004) obtained quasi periods ranging between 7 and 44 days during the project European Experiment on Transport and Transformation of Environmentally Relevant Trace Constituents in the Troposphere over Europe, subproject Tropospheric Ozone Research (EUROTRAC-TOR), and Sebald et al. (2000) found ozone peaks in monitoring station measurements corresponding to the range of 16–35 days, relating them to synoptic-scale fluctuations.

Several European projects carried out in the western Mediterranean basin from the 1990s to the present [Meso-Meteorological Cycles of Air Pollution in the Iberian Peninsula (MECAPIP), Regional Cycles of Air Pollutants in the West-Central Mediterranean Area (RECAPMA), South European Cycles of Air Pollution (SECAP), Regional Assessment and Modelling of the Carbon Balance within Europe (RECAB), Integrated Systems for Forecasting Urban Meteorology, Air Pollution and Population Exposure (FUMAPEX; http://fumapex.dmi.dk), and Climate Change and Impact Research: the Mediterranean Environment (CIRCE; http://www.circeproject.eu/)] have experimentally documented a vertical recirculation of air masses over the Mediterranean linked to coastal sea breezes (Millán et al. 1996, 2000; Moussiopoulos et al. 2005; Palau et al. 2005a). This vertical recirculation favors the accumulation of water vapor and pollutants emitted to the troposphere in the Mediterranean (Gangoiti et al. 2001). Based on these projects and case studies, some periodicities using experimental and modeling procedures have already been published (Table 1).

Table 1.

Observed and modeled periodicities of tropospheric ozone in the WMB.

In the WMB, the accumulation of pollutants and water vapor is characterized by both marked daily cycles and maximum daily values that increase gradually with each consecutive day (Millán et al. 1996, 2000; Moussiopoulos et al. 2005). In this paper, we will refer to this characteristic pattern of a gradual increase in concentrations as the “recharging period” or “accumulation.” The modulations in the maximum ozone concentration values (recharging periods) shown in Fig. 1 have been observed at all the monitoring stations on the east side of the Iberian Peninsula regardless of their specific geographical location. Nevertheless, their occurrence at any particular moment depends both on the distance of the station to the coast and the season of the year. Differentiated and characteristic patterns are systematically found according to the relative position of the station with respect to the emission sources, distance to the coast, and altitude (Millán et al. 2000). Thus, five characteristic site types have been distinguished according to their daily O3 pattern (Millán et al. 1992, 1997, 2000): mountaintop (MT), upper valley floor (UV), intermediate valley, coastal plain (C), and coastline sites.

Fig. 1.

Portion of the ozone time series for the selected monitoring station. Time series used to test the methodology correspond to the daily maximum values (often situated in the central hours of the day). The time evolution of the time series generated from the daily maximum values is related to the recharging periods and recirculation of atmospheric pollutants in the region.

2. Procedure: Data analysis

The reconstruction flowchart (Fig. 2) depicts our proposed methodology for analyzing nonsteady principal harmonics in noisy meteorological time series. The key issue is to obtain the optimum number of components in the parametric reconstruction of the original data to prevent spurious harmonics due solely to the stochastic component. Following the principle of parsimony, as the reconstruction gets more unnecessarily complex (increasing the number of harmonics), one ends up including elements lacking any significant information.

Fig. 2.

Flowchart of the parametric reconstruction method. The reconstruction method is used with the Lomb algorithm to find the principal frequencies. A: We apply the t test to the amplitude of each component of the initial parametric reconstruction. If a term does not pass the test, we decrease the number of sinusoids in the summation and repeat the reconstruction until the t values of all the amplitudes in the sum are above the critical value tc.

In the reconstruction flowchart (Fig. 2), there are five stages that deserve a detailed description: detrending time series, filling gaps, reconstructing time series, choosing only statistically significant harmonics, and analyzing reconstructed time series.

Detrending is applied to remove the long-term change in the mean of the time series, which may mask other features of interest. As our focus is the study of high-frequency recurrences in environmental time series, and specifically features present at the mesoscale [i.e., time scales less than 1 month, following the Orlanski (1975) classification], we need to remove not only annual and interannual trends but also seasonal variations. For this reason, besides detrending the signal, we also apply a high-pass filter that retains only frequencies associated with periods shorter than 90 days (the threshold that corresponds to the harmonics of seasonal periodicities), and a low-pass filter that removes frequencies above the Nyquist frequency to avoid aliasing effects.
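The detrending and high-pass steps can be illustrated with a minimal Python sketch. The filter design is not specified in the text, so the least-squares linear detrend and the subtraction of a centered 91-day running mean used here are stand-in assumptions, and the function names are hypothetical. The sketch also assumes an evenly sampled, gap-free daily series.

```python
import math

def detrend_linear(values):
    """Subtract the least-squares straight line from an evenly sampled series."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(values)]

def high_pass_running_mean(values, window=91):
    """Crude high-pass filter: subtract a centered running mean.

    `window` is given in samples; near the edges the window is truncated
    to the samples that are available.
    """
    half = window // 2
    out = []
    for i, y in enumerate(values):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(y - sum(chunk) / len(chunk))
    return out

# Illustrative daily series: linear drift + annual cycle + 10-day cycle.
raw = [0.01 * d
       + math.sin(2 * math.pi * d / 360)
       + math.sin(2 * math.pi * d / 10)
       for d in range(360)]
filtered = high_pass_running_mean(detrend_linear(raw))
```

Away from the edges, `filtered` is dominated by the 10-day cycle, while the drift is removed and the annual cycle is strongly attenuated. A running mean is a blunt filter; a properly designed high-pass filter would give a sharper cutoff at 90 days.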

After this detrending and filtering, for our study of periodicities at high frequencies, it would theoretically suffice to perform the traditional Fourier transformation of the original time series to the frequency domain; however, data loss is a generalized problem in geophysical time series and a Fourier transformation is not always possible. Therefore, in most cases, it is necessary to deal with the problem of filling data gaps in long time series. One way to avoid this problem is to treat the original gapped time series as unevenly (unequally spaced) sampled data. For this reason, we use the Lomb–Scargle periodogram, instead of the traditional Fourier transformation, to transform the original data obtained in the time domain to the frequency domain.
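As an illustration of why the Lomb–Scargle periodogram suits gapped data, the following Python sketch evaluates the classical normalized periodogram (Lomb 1976; Scargle 1982) directly at a few candidate periods. It is a direct, slow evaluation written for clarity, not the fast algorithm of Press and Rybicki (1989); the function name, the gap positions, and the candidate periods are illustrative assumptions.

```python
import math

def lomb_scargle(times, values, periods):
    """Normalized Lomb-Scargle power at each trial period (direct evaluation)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    power = []
    for period in periods:
        w = 2.0 * math.pi / period
        # The offset tau makes the result independent of the time origin.
        s2 = sum(math.sin(2.0 * w * t) for t in times)
        c2 = sum(math.cos(2.0 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * w)
        cs = [math.cos(w * (t - tau)) for t in times]
        sn = [math.sin(w * (t - tau)) for t in times]
        yc = sum((v - mean) * c for v, c in zip(values, cs))
        ys = sum((v - mean) * s for v, s in zip(values, sn))
        cc = sum(c * c for c in cs)
        ss = sum(s * s for s in sn)
        power.append((yc * yc / cc + ys * ys / ss) / (2.0 * var))
    return power

# Daily samples of a 10-day cycle with two blocks of missing days.
gaps = set(range(40, 60)) | set(range(200, 230))
times = [float(d) for d in range(360) if d not in gaps]
values = [math.sin(2.0 * math.pi * t / 10.0) for t in times]

candidate_periods = [5.0, 7.0, 10.0, 15.0, 23.0, 30.0]
power = lomb_scargle(times, values, candidate_periods)
best_period = candidate_periods[power.index(max(power))]
```

No interpolation is needed: only the available sampling times enter the sums, so the missing days are simply absent from `times`.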

Once the Lomb–Scargle periodogram has been obtained, it is necessary to perform a significance test comparing the spectrum of the original time series with a red-noise spectrum (background noise) to check if there are any harmonics above the confidence levels. Environmental time series characteristically show a low signal-to-noise ratio at high frequencies. When this occurs, that is, when there are no harmonics higher than the red-noise levels, a parametric reconstruction of the original time series must be carried out.
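A red-noise significance level of this kind can be sketched as follows. The exact test used here is not spelled out in the text, so this sketch assumes the AR(1) background spectrum and chi-square (2 degrees of freedom) threshold described by Torrence and Compo (1998); the function names are hypothetical.

```python
import math

def lag1_autocorrelation(values):
    """Estimate the AR(1) coefficient from the lag-1 autocorrelation."""
    n = len(values)
    mean = sum(values) / n
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in values)
    return num / den

def red_noise_spectrum(alpha, freq):
    """Normalized AR(1) spectrum at `freq` (cycles per sample, 0 to 0.5)."""
    return (1.0 - alpha ** 2) / (
        1.0 + alpha ** 2 - 2.0 * alpha * math.cos(2.0 * math.pi * freq))

def red_noise_threshold(alpha, freq, confidence=0.95):
    """Level that a variance-normalized spectral peak must exceed."""
    # For 2 degrees of freedom the chi-square quantile has a closed form.
    chi2_two_dof = -2.0 * math.log(1.0 - confidence)
    return red_noise_spectrum(alpha, freq) * chi2_two_dof / 2.0

alpha = lag1_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0])
level = red_noise_threshold(alpha, freq=0.1)
```

With `alpha = 0` the background is white and the threshold is flat; a positive `alpha` raises the threshold at low frequencies and lowers it at high frequencies, which is the red-noise signature.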

The parametric reconstruction provides an approximation of the original time series by summing the waves (harmonics), where each summand is a function of three parameters: amplitude, frequency, and phase. To determine the frequency of each harmonic we use a fast algorithm (Press and Rybicki 1989) to compute the Lomb–Scargle periodogram (Lomb 1976).

Since the amplitudes can be assumed to be the weights of each one of the harmonics (summands) in the parametric reconstruction, we perform a Student’s t test of the amplitudes to find the maximum size (maximum number of summands) of the parametric reconstruction. Whereas a sufficiently high initial number of harmonics ensures that all the significant harmonics are considered, an excessively high number of summands will reproduce the environmental red noise, masking the harmonics related to significant periodicities. Through an iterative procedure starting with a sufficiently high initial number of harmonics (N), we decrease their number progressively until all the summands have amplitudes with t values higher than the critical t value (95% confidence level). The t value depends on the amplitude and on its error, which are recalculated in each iteration. The critical t value depends on the degrees of freedom, so it is also reset for each t test (because N decreases in each iteration). The iteration finishes when all amplitudes (and their corresponding harmonics) are significant and therefore the reconstructed series has the optimum number of harmonics, maximizing the signal-to-noise ratio.
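The iterative pruning can be sketched in Python under several stated assumptions. The candidate periods are taken as given (in the method they come from the Lomb periodogram), the amplitudes are fitted by ordinary least squares, the amplitude error is obtained by first-order error propagation, the least significant harmonic is dropped at each pass, and a fixed critical value of 1.96 (the large-sample limit) stands in for a critical t value that depends on the degrees of freedom. All function names are hypothetical.

```python
import math
import random

def least_squares(X, y):
    """Solve the normal equations; return (coefficients, inverse of X^T X)."""
    p = len(X[0])
    # Augmented matrix [X^T X | I], reduced by Gauss-Jordan elimination.
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [float(i == j) for j in range(p)] for i in range(p)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(p):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    inv = [row[p:] for row in A]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    return beta, inv

def amplitude_t_values(times, values, periods):
    """Fit mean + sinusoids; return (period, amplitude, t value) per harmonic."""
    X = [[1.0] + [f(2.0 * math.pi * t / p)
                  for p in periods for f in (math.cos, math.sin)]
         for t in times]
    beta, inv = least_squares(X, values)
    dof = len(values) - len(beta)
    s2 = sum((y - sum(b * x for b, x in zip(beta, row))) ** 2
             for row, y in zip(X, values)) / dof
    stats = []
    for k, period in enumerate(periods):
        i, j = 2 * k + 1, 2 * k + 2
        a, b = beta[i], beta[j]
        amp = math.hypot(a, b)
        # First-order (delta-method) variance of sqrt(a^2 + b^2).
        var = s2 * (a * a * inv[i][i] + b * b * inv[j][j]
                    + 2.0 * a * b * inv[i][j]) / amp ** 2
        stats.append((period, amp, amp / math.sqrt(var)))
    return stats

def prune_harmonics(times, values, periods, t_crit=1.96):
    """Drop the least significant harmonic and refit until all pass."""
    periods = list(periods)
    while periods:
        stats = amplitude_t_values(times, values, periods)
        weakest = min(stats, key=lambda s: s[2])
        if weakest[2] >= t_crit:
            return stats
        periods.remove(weakest[0])
    return []

# Gapped daily series with 10- and 30-day cycles plus Gaussian noise.
rng = random.Random(0)
times = [float(d) for d in range(360) if not 100 <= d < 120]
values = [3.0 * math.sin(2 * math.pi * t / 10)
          + 2.0 * math.cos(2 * math.pi * t / 30)
          + rng.gauss(0.0, 1.0) for t in times]
kept = prune_harmonics(times, values, [7.0, 10.0, 13.0, 19.0, 30.0, 45.0])
```

In this toy case the two true cycles are retained with amplitudes close to 3 and 2. Candidates that only fit noise are usually discarded, although a spurious harmonic can occasionally survive, as it does in the reconstructions reported in section 3.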

Throughout the iteration process, each parametric reconstruction must also pass an F test to ensure the statistical significance of the correlation between the original time series and the modeled one.
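One common form of such an F test is the overall regression F statistic, sketched below in Python. The text does not give the exact statistic, so this form and the function name are assumptions; `n_params` is the number of fitted parameters other than the mean (for example, two per harmonic when the frequencies are fixed).

```python
def regression_f_statistic(observed, modeled, n_params):
    """Overall F statistic of a fitted model against the constant-mean model."""
    n = len(observed)
    mean = sum(observed) / n
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_res = sum((y - m) ** 2 for y, m in zip(observed, modeled))
    dof_model = n_params
    dof_resid = n - n_params - 1
    return ((ss_tot - ss_res) / dof_model) / (ss_res / dof_resid)

observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
modeled = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
f_value = regression_f_statistic(observed, modeled, n_params=1)
```

The reconstruction passes when `f_value` exceeds the critical value of the F distribution with (`dof_model`, `dof_resid`) degrees of freedom at the chosen confidence level, taken from a table or from a statistics library.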

The significant harmonics representing the parametric reconstruction of the original time series allow us to quantitatively identify the frequencies of the high-frequency cycles present in the data or, through direct transformation from frequencies into periods, their corresponding time periods (hours, days, etc., depending on the time scale in the original data sampling). To fit the sampling time scale to the periods obtained from the parametric reconstruction, periods belonging to the same sampling time scale should be accumulated in one single period, averaging the amplitudes involved.
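The conversion from frequencies to periods and the accumulation of periods that fall within the same sampling interval can be sketched as follows. Rounding each period to the nearest multiple of the sampling resolution is an assumption about how "the same sampling time scale" is decided, and the function name and sample values are hypothetical.

```python
from collections import defaultdict

def accumulate_periods(harmonics, resolution=1.0):
    """Group (frequency, amplitude) pairs by period rounded to `resolution`.

    Returns {period: mean amplitude}, with the period expressed in the
    sampling time unit (days for daily data).
    """
    groups = defaultdict(list)
    for frequency, amplitude in harmonics:
        period = round((1.0 / frequency) / resolution) * resolution
        groups[period].append(amplitude)
    return {period: sum(a) / len(a) for period, a in sorted(groups.items())}

# Frequencies in cycles per day; the first two both round to a 10-day period.
harmonics = [(0.100, 3.0), (0.098, 1.0), (0.0667, 2.0)]
periods = accumulate_periods(harmonics)
```

Here the harmonics at 10.0 and about 10.2 days are merged into a single 10-day period whose amplitude is the mean of the two.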

Finally, to test the time correlation between the original time series and the reconstructed one, it is possible to perform a coherence wavelet test that determines variations in the distribution of the power for each harmonic (relative predominance) among the different harmonics in the time series over time (Torrence and Compo 1998).

Specifically, we chose the Morlet function to carry out the continuous wavelet transform. One single Morlet wavelet (or wavelet basis) is defined as

 
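$$\psi_0(\eta) = \pi^{-1/4}\, e^{i m \eta}\, e^{-\eta^2/2},$$

$$\hat{\psi}_0(s\omega) = \pi^{-1/4}\, H(\omega)\, e^{-(s\omega - m)^2/2}.$$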

In these equations, η is a nondimensional time parameter, m is the wavenumber, and H is the Heaviside function [H(ω) = 1 if ω > 0, H(ω) = 0 otherwise].

This complex function consists of a plane wave modulated by a Gaussian function. Working with a complex wavelet is an advantage, as it allows us to extract information about the amplitude, phase, and main behavior of the oscillations. Furthermore, if we consider only the real part, we can detect isolated peaks or discontinuities.
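The "plane wave modulated by a Gaussian" structure, and the way a complex wavelet yields an amplitude, can be made concrete with a short Python sketch. It assumes the standard Morlet form and scale-to-period relation of Torrence and Compo (1998), a wavenumber m = 6, and a direct (slow) evaluation of a single wavelet coefficient; the function names are hypothetical.

```python
import cmath
import math

def morlet(eta, m=6.0):
    """Morlet wavelet: a complex plane wave under a Gaussian envelope."""
    return math.pi ** -0.25 * cmath.exp(1j * m * eta) * math.exp(-0.5 * eta * eta)

def cwt_coefficient(x, n, scale, dt=1.0, m=6.0):
    """Wavelet coefficient at time index n and the given scale (direct sum)."""
    norm = math.sqrt(dt / scale)
    return norm * sum(xk * morlet((k - n) * dt / scale, m).conjugate()
                      for k, xk in enumerate(x))

def scale_for_period(period, m=6.0):
    """Scale whose equivalent Fourier period equals `period`."""
    return period * (m + math.sqrt(2.0 + m * m)) / (4.0 * math.pi)

# A pure 10-day cycle responds at the 10-day scale, not at the 30-day scale.
x = [math.sin(2.0 * math.pi * k / 10.0) for k in range(200)]
w10 = abs(cwt_coefficient(x, 100, scale_for_period(10.0)))
w30 = abs(cwt_coefficient(x, 100, scale_for_period(30.0)))
```

The modulus of the coefficient measures the local amplitude of the oscillation at that scale, and its argument gives the local phase.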

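The continuous wavelet transform of a time series ($x_n$, $n = 1, \ldots, N$) is defined as the convolution of $x_n$ with a scaled and translated version of the wavelet:

$$W_n(s) = \sum_{n'=1}^{N} x_{n'}\, \psi^{*}\!\left[\frac{(n' - n)\,\delta t}{s}\right],$$

where $s$ is the wavelet scale, $\delta t$ is the sampling interval, and the asterisk denotes the complex conjugate.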

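The wavelet coherence shows how coherent the modeled and the initial time series are. The wavelet coherence has the expression

$$R_n^2(s) = \frac{\left|S\!\left(s^{-1} W_n^{XY}(s)\right)\right|^2}{S\!\left(s^{-1}\left|W_n^{X}(s)\right|^2\right)\, S\!\left(s^{-1}\left|W_n^{Y}(s)\right|^2\right)},$$

where $W^X$ and $W^Y$ are the continuous wavelet transforms of the two time series, $W^{XY}$ is their cross-wavelet transform, and $S$ is a smoothing operator in time and scale. This expression is formally identical to a localized correlation coefficient. In this way, we are able to check whether the behavior of the reconstructed and the original time series is equivalent and to establish the time range of this coherence.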

Grinsted et al. (2004) suggested the implementation of this function to study the relationships in time–frequency space between two different geophysical time series that could have common links through the climate system. In our method, we use the coherence wavelet to check the time evolution of the local correlation, in time–frequency space, between the original time series and the parametric reconstruction.

3. Results

In this section, we present the results obtained after applying the aforementioned methodology to two different datasets. We first apply the parametric reconstruction to a well-known synthetic (artificial) time series, and then we apply the same methodology to real environmental time series (specifically to three tropospheric ozone time series measured hourly and simultaneously at different sites in the same airshed on the Mediterranean side of the Iberian Peninsula during a 14-yr period).

First, when applying the method to the synthetic time series, the objective is to test the methodology by analyzing to what extent the reconstruction is able to differentiate between the significant harmonics (which are, in this case, known) and the random background noise generated to decrease the signal-to-noise ratio at high frequencies. To check the potentiality of the parametric reconstruction proposed in this paper, we compare it with the results obtained after applying a traditional methodology (i.e., using a linear interpolation for filling gaps and performing a Fourier transformation of the time series).

Second, we show how applying our methodology makes it possible to identify the harmonic components of greatest importance in three ozone time series, and show good agreement with the previously identified regional atmospheric processes that govern the time evolution and spatial distribution of tropospheric ozone concentrations.

a. Test with a synthetic time series

The synthetic time series has been generated to include the typical (trivial) environmental periodicities: annual, seasonal, and diurnal. Furthermore, we have added to the synthetic signal a random Gaussian noise variability with a mean of zero and a standard deviation of 50% of the function value at each time point. In this way, a nonsteady (not stationary) and noisy signal is generated showing the most common features of the environmental time series (Figs. 3, 8).

Fig. 3.

Synthetic gapped time series used to test the methodology.

We consider a time series with 360 data points representing the annual variability of four periodicities (harmonics). These are distributed in the time series as follows: a 10-day harmonic is used in the first 90 days (representing the first season of the year), followed in each subsequent 90-day block by harmonics of 15, 23, and 30 days, respectively (Table 2). Furthermore, we have introduced an annual trend as a 360-day period with a time shift (or phase), so that the maximum is not necessarily in the middle of the time series. We have also superimposed on the above-described harmonics, throughout the 360 days, a high-frequency harmonic with a periodicity of 2.4 days.

Table 2.

Function of the synthetic time series representing the annual evolution of an environmental data series. Time represents days of the year. The harmonics present in this function correspond to the periods of 2.4, 10, 15, 23, 30, and 360 days. A random noise with 50% standard deviation is added throughout the whole year.

To check the capabilities of the methodology for the parametric reconstruction of this synthetic time series, we have simulated data loss by generating gaps at different times and with different extensions within the dataset. A representation of the synthetic time series, with and without gaps, is presented (Fig. 3). The five periods (harmonics) introduced in the synthetic time series (Table 2) are also shown (Figs. 4a and 5a).
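A series with the structure described above can be generated with the following Python sketch. The periods, the 90-day seasons, and the noise level (standard deviation equal to 50% of the function value) follow the description in the text; the unit amplitudes, the annual phase, the random seed, and the gap positions are illustrative assumptions, since the actual function is given only in Table 2.

```python
import math
import random

SEASONAL_PERIODS = (10.0, 15.0, 23.0, 30.0)  # days, one per 90-day season

def clean_signal(day, amplitude=1.0, annual_phase=0.5):
    """Noise-free value: seasonal harmonic + 2.4-day harmonic + annual cycle."""
    seasonal = SEASONAL_PERIODS[min(int(day // 90), 3)]
    return (amplitude * math.sin(2.0 * math.pi * day / seasonal)
            + amplitude * math.sin(2.0 * math.pi * day / 2.4)
            + amplitude * math.sin(2.0 * math.pi * day / 360.0 + annual_phase))

def synthetic_series(n_days=360, noise_fraction=0.5, seed=0):
    """Add Gaussian noise whose standard deviation scales with the signal."""
    rng = random.Random(seed)
    series = []
    for day in range(n_days):
        f = clean_signal(day)
        series.append(f + rng.gauss(0.0, noise_fraction * abs(f)))
    return series

def punch_gaps(series, gaps):
    """Return (day, value) pairs with the [start, stop) day ranges removed."""
    missing = {d for start, stop in gaps for d in range(start, stop)}
    return [(d, v) for d, v in enumerate(series) if d not in missing]

series = synthetic_series()
gapped = punch_gaps(series, [(40, 55), (200, 230)])
```

The gapped version keeps the original day index of each surviving sample, which is what an unevenly sampled analysis such as the Lomb–Scargle periodogram requires.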

Fig. 4.

Periods using the traditional Fourier transformation. Only frequencies with more than a 95% confidence level are shown. (a) Initial periods used to generate the synthetic time series. (b) Time series reconstruction without gaps. (c) Time series reconstruction with random gaps.

Fig. 5.

Periods useful and significant for a noisy and nonsteady synthetic time series. The effect of the gaps is shown here. (a) Initial periods used to generate the data. (b) The time series reconstruction without gaps. (c) The reconstruction with random gaps.

1) Fourier analysis

First, we perform an interpolation to fill the gaps in the synthetic time series and, second, we apply a traditional Fourier transformation to extract periodicities with a 95% confidence level over noise (Fig. 4c). The high-frequency periodicities included in the synthetic data (2.4, 10, and 15 days) are not reproduced using this method but, in this case, we cannot assert that this masking is due only to the gaps introduced in the time series. It is important to note that, in the definition of this synthetic time series, we added some background noise, and this component may also be responsible for the masking of the periodicities. To show the effect of the presence of gaps in this synthetic time series, we again performed a Fourier transformation of the synthetic time series but without gaps (Fig. 4b). After applying the Fourier transformation to the gapped and ungapped synthetic time series, no differences were found between them (Figs. 4b,c). This result suggests that the signal-to-noise ratio is a more important factor than the presence of gaps in this synthetic time series. As a matter of fact, the effect of gaps strongly depends on their distribution within the time series (if they mask a whole cycle, if the occurrence of gaps is periodic, etc.).
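The traditional approach (linear interpolation followed by a Fourier transformation) can be sketched in Python as follows. The direct discrete Fourier transform is written out for clarity rather than speed; the function names, the noise-free test signal, and the gap position are illustrative assumptions, and the 95% confidence test is omitted.

```python
import cmath
import math

def fill_gaps_linear(days, values, n_days):
    """Linearly interpolate onto days 0..n_days-1; hold end values at the edges."""
    filled = []
    j = 0
    for d in range(n_days):
        # Advance to the last observation at or before day d.
        while j + 1 < len(days) and days[j + 1] <= d:
            j += 1
        if days[j] == d or j + 1 == len(days) or d < days[0]:
            filled.append(values[j])
        else:
            w = (d - days[j]) / (days[j + 1] - days[j])
            filled.append(values[j] + w * (values[j + 1] - values[j]))
    return filled

def dft_power(x):
    """Power at harmonics k = 0..N/2 of the mean-removed series."""
    n = len(x)
    mean = sum(x) / n
    return [abs(sum((v - mean) * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(x))) ** 2 / n
            for k in range(n // 2 + 1)]

# Noise-free 10-day cycle with a 5-day gap.
days = [d for d in range(360) if not 100 <= d < 105]
values = [math.sin(2.0 * math.pi * d / 10.0) for d in days]
filled = fill_gaps_linear(days, values, 360)
power = dft_power(filled)
peak_k = max(range(1, len(power)), key=power.__getitem__)
peak_period = 360 / peak_k
```

With a clean signal and a short gap, the interpolation does little harm and the peak stays at 10 days. The difficulty discussed above arises when noise is strong or when gaps cover whole cycles, because the interpolated values then add power that is not in the data.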

2) Parametric reconstruction

The reconstruction of the time series without gaps (Fig. 5b) reproduces the five simulated periods with a deviation of 1 day. Nevertheless, some noise is also reproduced in the periods of 21 and 34 days. The reconstruction of the time series with gaps (Fig. 5c) also reproduces the synthetic modes of the original time series with 1 day of deviation. In this latter case, a harmonic associated with a 38-day period appears, probably due to the effect of the introduced gaps and/or the background noise.

The 1-day deviation in the reproduction of the periods is due to the fact that the time resolution of the original time series is 1 day, and therefore, in the best case, this is the minimum time resolution one can expect to obtain from the parameterized signal.

As explained in the second section, wavelet coherence shows the time-dependent correlation between the data and the reconstructed time series. Grinsted et al. (2004) suggested a graphical representation (Fig. 6) of coherence wavelets calculated following the Torrence and Compo (1998) guide. In Fig. 6, clearer areas (white) represent the highest coherences (coherence values close to 1). The cone of influence, or region of the wavelet spectrum in which edge effects are important, is also represented; values outside the cone of influence should not be considered. Vectors in Fig. 6 represent the phase between the two time series involved in the coherence wavelet; in-phase vectors point right, whereas antiphase vectors point left. This representation of the coherence wavelet gives an idea of the evolution of the persistence (periods in y axis) throughout the whole time series (days in x axis). Finally, the statistical significance levels (bold lines in Fig. 6) are calculated using Monte Carlo methods as described by Torrence and Compo (1998).

Fig. 6.

Representation of wavelet coherences as suggested by Grinsted et al. (2004). Thick contours show the 5% significance level against red noise. The arrows represent the relative phase; in-phase vectors point right, whereas antiphase vectors point left. Generated data (left) without and (right) with gaps and its parametric reconstruction.

Coherence wavelets for both synthetic cases (with and without gaps; Fig. 6) show good time-dependent correlations. The synthetic time series without gaps (Fig. 6, left) presents a lower level of precision than the one with gaps (Fig. 6, right). The harmonics of the parametric reconstruction obtained in the first case are less localized than in the second case. This is because, during the reconstruction, we obtained more significant “reconstructed” harmonics around the real harmonics (Figs. 5, 6).

Seasonality is clearly shown by the evolution of the significance levels throughout the year (360 data points). Four areas with high correlation, inside the 5% significance level against red noise, are clearly displayed in Fig. 6. During the first 90 days, high-significance harmonics are found around the 12-day period for the nongapped series and around the 8-day period for the gapped series. For the second 90-day period of the year, significant harmonics are found around the 15-day period in both cases, etc. Furthermore, the highest frequency introduced in the synthetic time series, the 2.4-day period, appears in the reconstructed harmonics as a discontinuous coherence of less than 3 days (Fig. 6). In spite of this discontinuity, statistically significant levels are present in high-frequency harmonics throughout the whole time series length.

b. Test with tropospheric ozone time series

1) Castellon (Spain) air quality surveillance automatic network

In this section we apply our methodology for the parametric reconstruction of nonsteady and noisy environmental data series to several time series obtained from the Valencia regional air quality monitoring network (in eastern Spain; Fig. 7). This network, which is distributed throughout the whole Valencia Autonomous Community, consists of 29 automatic monitoring stations that take continuous measurements of a number of air pollutants (SO2, NOx, O3, …).

Fig. 7.

Spatial distribution of the selected monitoring stations in the northern part of the Valencia region. All the stations are under the influence of the mesoscale atmospheric dynamics in the region, showing different patterns depending on their distance to the coast (Millán et al. 2000).

Herein we use the data from three monitoring stations (Table 3) that represent the spatial pattern of the tropospheric ozone time evolution in the whole region [as justified by Millán et al. (2000)]: a coastline site (Grao station, type C), an upper valley rural station (Vila station, type UV, situated 45 km inland), and a mountaintop site (Morella station, type MT, situated 50 km inland from the coastline). Each station is equipped with a DASIBI 1008-RS monitor that continuously measures the ozone concentration at 10-min intervals.

Table 3.

Main characteristics for each monitoring station considered. Temporal coverage is expressed in years. The “Percent coverage” corresponds to the percentage of valid data after a validation process (i.e., excluding gaps and erroneous data). Number of measures (or data) corresponds to the whole extension of time series. Station types from Millán et al. (2000): typology established in relation to the geographical location of stations in the area.

As stated by Millán et al. (2000), the observed ozone cycles depend on the extension and strength of the mesoscale circulations (sea breezes) and on the topographic location of the observing station, and they show large differences over distances of tens of kilometers. The Grao station (type C) is immersed within the sea breezes practically the whole year, and thus the ozone concentrations recorded at this site come from the reservoir layers accumulated over the sea as a result of vertical recirculation dynamics (Millán et al. 2000). The typical O3 cycles at the Vila and Morella mountain stations (types UV and MT) are influenced by sea breezes mainly during summertime. Therefore, during the rest of the year (winter season), these two inland stations are not so sensitive to the accumulation (recharging) processes (see subsection in introduction for a detailed description). Morella, located on the top of a mountain, is directly influenced by synoptic flows, while Vila, an upper valley rural station, is more frequently embedded in mesoscale circulations. In this paper we discard urban site stations to avoid the hebdomadal cycle (Pryor and Steyn 1995), as we are interested in studying only meteorological patterns and not anthropogenic-related periodicities.

From a meteorological point of view, there are two main prevailing and differentiated meteorological regimes that explain the ozone concentrations and their daily evolution at the different monitoring stations: winter and summer.

During summertime all the stations in the region, even the MT-type Morella station, are embedded within the sea-breeze circulations and, therefore, their time series should present a periodicity linked to the recharging periods (with characteristic time scales of 3–9 days according to the literature; Millán et al. 1996, 2000; Moussiopoulos et al. 2005; Gangoiti et al. 2001).

During wintertime, only the time series recorded at C-type stations and, to a lesser degree, at UV-type stations show recurrent recharging periods.

2) Tropospheric ozone recharging periods in Castellon

To test our methodology, we applied it to three 14-yr-long hourly time series from the aforementioned sites. The aim was to check (i) whether there are statistically significant harmonics (periods) that might correspond to the recharging periods observed in the region in different case studies (each a few days long) and (ii) whether these periods are seasonally and spatially dependent, in line with the previously published results (Millán et al. 1996, 2000; Moussiopoulos et al. 2005; Gangoiti et al. 2001).

The main objective of this section is to match the significant signals embedded in the ozone time series with meteorological processes observed previously, leaving new physical characterization and interpretation to a future study.

For this purpose, we calculated the diurnal maximum for each day (as an average of three consecutive hours) for the whole time series. This gave us a collection of daily maxima with which to analyze the behavior of the highest daily ozone value. Figure 1 shows a portion of the original time series (hourly data), where two 4-day accumulation cycles are easily identifiable, and Fig. 8 shows the ozone daily maxima time series for the whole 14-yr period and for each monitoring station.
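
The daily-maximum step can be illustrated with a minimal Python sketch. It reads "an average of three consecutive hours" as the largest 3-h running mean within each day, which is only one possible interpretation. The function name `daily_max_3h`, the use of `None` for missing hours, and the rule that a window containing a gap is skipped are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Optional


def daily_max_3h(hourly: List[Optional[float]],
                 hours_per_day: int = 24) -> List[Optional[float]]:
    """Return, for each complete day, the largest 3-h running mean.

    `hourly` holds one value per hour; None marks a missing measurement.
    """
    daily = []
    for start in range(0, len(hourly) - hours_per_day + 1, hours_per_day):
        day = hourly[start:start + hours_per_day]
        best = None
        for i in range(len(day) - 2):
            window = day[i:i + 3]
            if any(v is None for v in window):
                continue  # assumption: ignore windows that contain a gap
            mean = sum(window) / 3.0
            if best is None or mean > best:
                best = mean
        # best stays None when the day has no complete 3-h window
        daily.append(best)
    return daily
```

Days without any complete 3-h window stay marked as missing, so the resulting daily series keeps its gaps rather than filling them.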

Fig. 8.

The 14-yr-long ozone daily maxima time series for each of three monitoring stations in the Valencia air quality surveillance network.

Then, to analyze any signals linked to recharging periods (avoiding seasonal periodicities), we split the time series (Fig. 8) into summer and winter. In this way, we obtained two time series for each station: one with the 14 summer periods and the other with the corresponding 14 winter periods. We applied the methodology shown in Fig. 2 to the summer and winter datasets, performing a 50-day filtering to emphasize the cycles below this value (i.e., the time scales related to the mesoscale) and to avoid the possible interannual discontinuities that could appear every 92 days (as a direct consequence of joining together all 14 winters and summers).
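
The seasonal split can be sketched as follows. The calendar definition of each season is an assumption here: June to August (92 days) for summer and December to February for winter. The helper `split_by_season` and the constants `SUMMER_MONTHS` and `WINTER_MONTHS` are hypothetical names used only for this illustration.

```python
from datetime import date, timedelta
from typing import List, Optional, Set

SUMMER_MONTHS = {6, 7, 8}    # assumption: June-August (92 days per year)
WINTER_MONTHS = {12, 1, 2}   # assumption: December-February


def split_by_season(start: date,
                    daily: List[Optional[float]],
                    months: Set[int]) -> List[Optional[float]]:
    """Concatenate the daily values whose calendar month is in `months`.

    `start` is the date of the first element of `daily` (one value per day).
    Consecutive years are simply joined, so the output has a discontinuity
    at every junction between one season and the same season a year later.
    """
    return [value for offset, value in enumerate(daily)
            if (start + timedelta(days=offset)).month in months]
```

With the June to August choice, the junctions between consecutive summers fall every 92 values of the concatenated series.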

We use the recharging periods as components linked to the meteorological processes driving part of the variability in the time series. Thus, the graphs in Fig. 9 are generated with the inverse of the frequency obtained in the parametric reconstruction of the measured time series. As we are interested in the high-frequency part of the spectra, we only represent periods shorter than 30 days on the x axis of Fig. 9; moreover, the “1-day” period is not shown on the abscissa because the dataset has a sampling interval of 1 day.
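
To make the frequency-to-period step concrete, the sketch below evaluates a normalized Lomb periodogram (Lomb 1976; Scargle 1982) directly on a grid of trial periods for an unevenly sampled series. It is a plain textbook implementation written for illustration, not the authors' code, and it leaves out the iterative t test and the parametric reconstruction. The function name `lomb_periodogram` and its arguments are assumptions.

```python
import math
from typing import List, Sequence


def lomb_periodogram(t: Sequence[float], y: Sequence[float],
                     periods: Sequence[float]) -> List[float]:
    """Normalized Lomb periodogram evaluated at the given trial periods.

    `t` are the sampling times (days, gaps allowed), `y` the measured values,
    and `periods` the trial periods in days (period = 1 / frequency).
    """
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / (n - 1)
    power = []
    for period in periods:
        w = 2.0 * math.pi / period  # angular frequency in rad per day
        # The time offset tau makes the sine and cosine terms orthogonal.
        tau = math.atan2(sum(math.sin(2.0 * w * ti) for ti in t),
                         sum(math.cos(2.0 * w * ti) for ti in t)) / (2.0 * w)
        c = [math.cos(w * (ti - tau)) for ti in t]
        s = [math.sin(w * (ti - tau)) for ti in t]
        yc = sum((yi - mean) * ci for yi, ci in zip(y, c))
        ys = sum((yi - mean) * si for yi, si in zip(y, s))
        cc = sum(ci * ci for ci in c)
        ss = sum(si * si for si in s)
        # Guard against a vanishing denominator (e.g., at the Nyquist period).
        term_c = yc * yc / cc if cc > 1e-12 else 0.0
        term_s = ys * ys / ss if ss > 1e-12 else 0.0
        power.append((term_c + term_s) / (2.0 * var))
    return power
```

For daily data, a grid of trial periods between 3 and 30 days covers the range plotted in Fig. 9 while staying clear of the 2-day Nyquist period.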

Fig. 9.

Statistically significant principal harmonics (t test of the amplitudes) that explain the variability observed in the original time series. The graphs present the amplitude and the period of each sinusoid of the reconstructed time series for each monitoring station in wintertime and summertime.

The spectra of the reconstructed time series for each station (in time–space) show a clear seasonal bias (Fig. 9). In summertime, the Grau station has statistically significant principal harmonics with periods of 5, 7, 8, 11, 12, and 15 days. The 5-, 7-, and 8-day periods appear only at the Grau station, while the longer periods are present at all the stations, regardless of the season. These results are in good agreement with the previously published experimental evidence (Table 1) regarding vertical recirculation of air masses and accumulation of pollutants in recharging periods (with observed recharging periods of 4–5 days and modeled recirculations of more than 5 days).

The periods detected at the Grau station in summer (5, 7, 8, and 11 days) and in winter (9 and 11 days) can be explained by the mesoscale processes (sea-breeze circulations driving the previously described pollutant recharging periods) that prevail at this type of site (coastal station) during all four seasons of the year (Millán et al. 2000). The other monitoring stations, located inland, show seasonal differences in their spectra because in wintertime they are frequently under synoptic meteorological fluctuations (typically with periods longer than 9 days), whereas in summertime they are embedded in mesoscale circulations.

Coherence wavelets (Fig. 10) show statistical coherence between the reconstructed and the original time series (both filtered at 50 days, i.e., 0.02 cycles per day). The thick contour shows the 5% significance level against red noise. The cone of influence is the area where edge effects could distort the graph, and it is shown as a bleached area. The arrows indicate the phase relationship between the two time series. Wavelet coherence represents the continuity of the periods through the years in the two seasons studied. Unlike for the synthetic time series analyzed in section 3a, the wavelet coherences are not so good here because the noise-to-signal ratio in environmental series is very high (mainly at higher frequencies, i.e., shorter periods). Nevertheless, some relevant features can be described from Fig. 10.

Fig. 10.

Wavelet coherence between the reconstructed time series and the initial dataset, both filtered at 50 days (0.02 cycles per day). The thick contour shows the 5% significance level against red noise. The cone of influence is the area where edge effects could distort the graph, and it is shown as a bleached area. The arrows indicate the phase relationship.

The most evident feature is that the summer behavior is more discontinuous (interannual variability) than the winter behavior. In wintertime, meteorological processes associated with synoptic conditions (lower-frequency time scales) are persistent and prevail through consecutive years. Higher coherences are found in the region of synoptic frequencies mainly in wintertime (corresponding periods ranging from weeks to months; Orlanski 1975). During this season, only the coastal station (Grau) presents a degree of coherence at higher (mesoscale) frequencies (following the Orlanski classification, mesoscale time scales are typically on the order of a few days).

In summertime, coherences at higher frequencies are present at all stations (although the more inland the station, the lower the coherence value, i.e., the more weakly it is influenced by coastal sea-breeze circulations). At the mesoscale, the highest coherence values at the two inland stations were obtained between 9 and 11 days, while at the coastal station they were obtained between 5 and 11 days.

Regardless of the season, the interannual variability at shorter periods is to be expected given the above-mentioned greater noise-to-signal ratio at high frequencies. Nevertheless, regions inside the 5% significance level against red noise are always present.

4. Conclusions

Our parametric reconstruction of environmental nonsteady and noisy time series has allowed us to reproduce the main significant periodicities (harmonics). This is a challenge, especially for the high frequencies in environmental time series where the signal-to-noise ratio is usually very low. Therefore, the limit of applicability of the methodology proposed in this paper is determined by the power of the noise in relation to the power of the periodicity (signal-to-noise ratio).

The methodology proposed has been tested by applying it to two synthetic time series (with and without gaps in the dataset) and to three real environmental time series. Moreover, the new methodology has been compared with the results obtained from a traditional Fourier transformation. While traditional Fourier transformation was not capable of reproducing the principal harmonics of the (gapped and ungapped) synthetic time series, the new methodology proposed in this paper reproduced completely, in both cases, the time dependence of the frequencies (periodicities) present in the original synthetic dataset. Results from the parametric reconstruction were obtained with uncertainties on the order of the time resolution of the initial synthetic datasets (1 day). It is remarkable that the new method showed very similar skills when applied to both the ungapped and the randomly gapped datasets.

When applied to a “real” environmental dataset, as a general conclusion, our methodology has been capable of determining statistically significant recurrences in high-frequency periodicities under high noise-to-signal ratios. These periodicities are in good agreement with the ozone cycles that have been identified, measured, and modeled on the Mediterranean side of Spain in different field campaigns and case studies from the 1990s up to the present (Table 1). The statistical significance of the periodicities identified in the three 14-yr-long ozone time series broadens the temporal and spatial representativity of the mesometeorological processes previously identified during field campaigns lasting a few days and in short-term numerical simulations of selected meteorological episodes.

The added value of our parametric reconstruction of these ozone time series is the statistical significance of the results. Previous studies were focused on qualitative descriptions of time series trends and recurrences (mainly seasonal and daily cycles) and on quantitative analyses of several case studies. The test described in this paper presents results that not only show good agreement with previous conclusions, but that were obtained by analyzing an entire 14-yr period and quantifying the statistically significant persistence at high frequencies.

From the parametric reconstruction of three 14-yr-long time series, which represent the spatial pattern of the tropospheric ozone time evolution in the whole region of Valencia (Spain), it is possible to assert that

  1. depending on the season and the location of the monitoring station with respect to the coast, characteristic times of high-frequency tropospheric ozone variabilities range from 5 to 11 days. Comparing the results with previous meteorological studies, these variabilities are in good agreement with the characteristic time scales of mesoscale processes driving recharging periods of tropospheric pollutants in the studied area.

  2. during summertime, some high-frequency variabilities (5–11 days) are present in the whole study area, whereas during wintertime there is a decoupling between the coastal station and the inland stations.

  3. regardless of the season, longer time scales (>11 days) are present in the whole study area. These longer periodicities might be attributable to higher meteorological spatial scales that affect the whole studied region and, therefore, to synoptic meteorological scales.

Acknowledgments

The Instituto Universitario CEAM-UMH is partly supported by the Generalitat Valenciana (València, Spain). The authors would like to acknowledge J. Scheiding’s editing of the English text. This study has been partially funded by the Ministerio de Educación y Ciencia, research projects MODELISMOS (CGL2010-17623/CLI), GRACCIE (CSD2007-00067, Programa CONSOLIDER-INGENIO 2010), and FEEDBACKS (Prometeo-Generalitat Valenciana).

REFERENCES

Bond, G., and Coauthors, 2001: Persistent solar influence on North Atlantic climate during the Holocene. Science, 294, 2130–2136.

Cvitaš, T., and Coauthors, 2004: Spectral analysis of boundary layer ozone data from the EUROTRAC TOR network. J. Geophys. Res., 109, D02302, doi:10.1029/2003JD003727.

Debret, M., and Coauthors, 2007: The origin of the 1500-year climate cycles in Holocene North-Atlantic records. Climate Past, 3, 569–575.

Gangoiti, G., M. M. Millán, R. Salvador, and E. Mantilla, 2001: Long-range transport and re-circulation of pollutants in the Western Mediterranean during the project regional cycles of air pollution in the West-Central Mediterranean area. Atmos. Environ., 35, 6267–6276.

Grinsted, A., J. C. Moore, and S. Jevrejeva, 2004: Application of the cross wavelet transform and wavelet coherence to geophysical time series. Nonlinear Processes Geophys., 11, 561–566.

Hasselmann, K., 1976: Stochastic climate models part I. Tellus, 28, 473–485.

Hocke, K., and N. Kämpfer, 2009: Gap filling and noise reduction of unevenly sampled data by means of the Lomb-Scargle periodogram. Atmos. Chem. Phys., 9, 4197–4206.

Horne, J. H., and S. L. Baliunas, 1986: A prescription for period analysis of unevenly sampled time series. Astrophys. J., 302, 757–763.

Lanfredi, M., T. Simoniello, V. Cuomo, and M. Macchiato, 2009: Discriminating low frequency components from long range persistent fluctuations in daily atmospheric temperature variability. Atmos. Chem. Phys., 9, 4537–4544.

Lau, K-M., and H. Weng, 1995: Climate signal detection using wavelet transform: How to make a time series sing. Bull. Amer. Meteor. Soc., 76, 2391–2402.

Lomb, N. R., 1976: Least-squares frequency analysis of unequally spaced data. Astrophys. Space Sci., 39, 447–462.

Matsoukas, C., S. Islam, and I. Rodriguez-Iturbe, 2000: Detrended fluctuation analysis of rainfall and streamflow time series. J. Geophys. Res., 105 (D23), 29 165–29 172.

Millán, M. M., B. Artiñano, L. Alonso, M. Castro, R. Fernandez-Patier, and J. Goberna, 1992: Meso-Meteorological Cycles of Air Pollution in the Iberian Peninsula (MECAPIP). Air Pollution Research Rep. 44, EUR 14834, 219 pp. [Available from European Commission DG XII/E-1, Rue de la Loi, 200, B-1040, Brussels.]

Millán, M. M., R. Salvador, E. Mantilla, and B. Artiñano, 1996: Meteorology and photochemical air pollution in Southern Europe: Experimental results from EC research projects. Atmos. Environ., 30, 1909–1924.

Millán, M. M., R. Salvador, E. Mantilla, and G. Kallos, 1997: Photooxidant dynamics in the Mediterranean basin in summer: Results from European research projects. J. Geophys. Res., 102 (D7), 8811–8823.

Millán, M. M., E. Mantilla, R. Salvador, A. Carratalá, M. J. Sanz, L. Alonso, G. Gangoiti, and M. Navazo, 2000: Ozone cycles in the western Mediterranean basin: Interpretation of monitoring data in complex coastal terrain. J. Appl. Meteor., 39, 487–507.

Millán, M. M., M. J. Estrela, and J. Miró, 2005: Rainfall components: Variability and spatial distribution in a Mediterranean area (Valencia region). J. Climate, 18, 2682–2705.

Moussiopoulos, N., and Coauthors, 2005: Meteorological aspects of air pollution episodes in southern European cities. Meteorology Applied to Urban Air Pollution Problems, Final Rep. COST Action 715, B. Fisher et al., Eds., Demetra Ltd., 119–134.

Orlanski, I., 1975: A rational subdivision of scales for atmospheric processes. Bull. Amer. Meteor. Soc., 56, 527–530.

Palau, J. L., M. M. Millán, and G. Pérez-Landa, 2005a: Contribution to background air quality at regional and global scales. Report of ACCENT-T&TP Barnsdale Expert Meeting: Frontiers in Transport and Transformation of Pollutants, P. Monks and P. Borrell, Eds., ACCENT, 57–60.

Palau, J. L., G. Pérez-Landa, J. J. Diéguez, C. Monter, and M. M. Millán, 2005b: The importance of meteorological scales to forecast air pollution scenarios on coastal complex terrain. Atmos. Chem. Phys., 5, 2771–2785.

Palau, J. L., G. Pérez-Landa, and M. M. Millán, 2009: Transitional dispersive scenarios driven by mesoscale flows on complex terrain under strong dry convective conditions. Atmos. Chem. Phys., 9, 119–131.

Press, W., and G. B. Rybicki, 1989: Fast algorithm for spectral analysis of unevenly sampled data. Astrophys. J., 338, 277–280.

Pryor, S. C., and D. G. Steyn, 1995: Hebdomadal and diurnal cycles in ozone time series from the Lower Fraser Valley. Atmos. Environ., 29, 1007–1019.

Salvador, R., J. Calbó, and M. M. Millán, 1999: Horizontal grid size selection and its influence on mesoscale model simulation. J. Appl. Meteor., 38, 1311–1329.

Scargle, J. D., 1982: Studies in astronomical time series analysis II. Statistical aspects of spectral analysis of unevenly sampled data. Astrophys. J., 263, 835–853.

Sebald, L., R. Treiffeissen, E. Reimer, and T. Hies, 2000: Spectral analysis of air pollutants. Part 2: Ozone time series. Atmos. Environ., 34, 3503–3509.

Torrence, C., and G. P. Compo, 1998: A practical guide to wavelet analysis. Bull. Amer. Meteor. Soc., 79, 61–78.