• Arguez, A., M. A. Bourassa, and J. J. O’Brien, 2005: Extraction of the MJO signal from QuikSCAT. J. Atmos. Oceanic Technol., 22, 1885–1894.

  • Bloomfield, P., 2000: Fourier Analysis of Time Series—An Introduction. 2nd ed. Wiley, 261 pp.

  • Ghil, M., and Coauthors, 2002: Advanced spectral methods for climatic time series. Rev. Geophys., 40, 1003, doi:10.1029/2000RG000092.

  • Hamming, R. W., 1989: Digital Filters. 3rd ed. Prentice Hall, 284 pp.

  • Madden, R. A., and P. R. Julian, 1971: Detection of a 40–50 day oscillation in the zonal wind in the tropical Pacific. J. Atmos. Sci., 28, 702–708.

  • Mann, M. E., 2004: On smoothing potentially non-stationary climate time series. Geophys. Res. Lett., 31, L07214, doi:10.1029/2004GL019569.

  • Park, J., 1992: Envelope estimation for quasi-periodic geophysical signals in noise: A multitaper approach. Statistics in the Environmental and Earth Sciences, A. T. Walden and P. Guttorp, Eds., Halsted Press, 189–219.

  • Pegion, P. J., M. A. Bourassa, D. M. Legler, and J. J. O’Brien, 2000: Objectively derived daily “winds” from satellite scatterometer data. Mon. Wea. Rev., 128, 3150–3168.

  • Schoof, J. T., A. Arguez, J. Brolley, and J. J. O’Brien, 2005: A new weather generator based on spectral properties of surface air temperatures. Agric. For. Meteor., 135, 241–251.

  • Smith, T. M., and R. W. Reynolds, 2003: Extended reconstruction of global sea surface temperatures based on COADS data (1854–1997). J. Climate, 16, 1495–1510.

  • Thompson, D. W. J., and J. M. Wallace, 1998: The Arctic Oscillation signature in the wintertime geopotential height and temperature fields. Geophys. Res. Lett., 25, 1297–1300.

  • Von Storch, H., and F. W. Zwiers, 1999: Statistical Analysis in Climate Research. Cambridge University Press, 484 pp.

  • Wheeler, M. C., and H. H. Hendon, 2004: An all-season real-time multivariate MJO index: Development of an index for monitoring and prediction. Mon. Wea. Rev., 132, 1917–1932.

    ENSO filter series. (top) ENSO time series utilized and (bottom) the FRF of the filter used on it.

    AO filter series. As in Fig. 1, but for the AO.

    MJO filter series. (top) The zonal velocity time series, (middle) the filter weights used, and (bottom) the associated FRF.

    Subset analysis schematic. A contrived 24-point time series is utilized for a 5-month running mean. Green squares represent input values solely used to compute the “true” filtered values. Blue diamonds indicate the estimated output values. Gray circles indicate interior points unaffected by endpoint issues because complete convolutions can be computed.

    ENSO root-mean-square errors. RMSEs as a function of position for the unconstrained (purple), equal-mean (green), equal-variance (blue), and spectral (red) methods for the ENSO project (top). A liberal skill threshold based on random filter output has an RMSE value of about 0.63.

    ENSO endpoint variance ratio. Variance of estimates as a function of endpoint location for the ENSO case, scaled by the variance of the “true” values.

    AO RMS errors. As in Fig. 5, but for the (top) AO project (skill threshold equals 0.4). (bottom) The least squares techniques’ RMSE values divided by the spectral RMSE values.

    AO endpoint variances. As in Fig. 6, but for the AO case.

    MJO RMS errors. As in Fig. 7, but for the MJO case (skill threshold equals 71.2).

    MJO endpoint variances. As in Fig. 6, but for the MJO case.

A New Method for Time Series Filtering near Endpoints

  • 1 NOAA/National Climatic Data Center, Asheville, North Carolina
  • 2 Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California
  • 3 Center for Ocean–Atmospheric Prediction Studies, Tallahassee, Florida

Abstract

Time series filtering (e.g., smoothing) can be done in the spectral domain without loss of endpoints. However, filtering is commonly performed in the time domain using convolutions, resulting in lost points near the series endpoints. Multiple incarnations of a least squares minimization approach are developed that retain the endpoint intervals that are normally discarded due to filtering with convolutions in the time domain. The techniques minimize the errors between the predetermined frequency response function (FRF) of interior points (the FRF being a fundamental property of all filters) and the FRFs that are to be determined for each position in the endpoint zone. The least squares techniques are differentiated by their constraints: 1) unconstrained, 2) an equal-mean constraint, and 3) an equal-variance constraint. The equal-mean constraint forces the new weights to sum to the same value as the predetermined weights. The equal-variance constraint forces the new weights to be such that, after they are convolved with the input values, the expected time series variance is preserved. The three least squares methods are each tested under three separate filtering scenarios [involving Arctic Oscillation (AO), Madden–Julian oscillation (MJO), and El Niño–Southern Oscillation (ENSO) time series] and compared to each other as well as to the spectral filtering method, which serves as the standard of comparison. The results indicate that all four methods (including the spectral method) possess skill at determining suitable endpoint estimates. However, both the unconstrained and equal-mean schemes exhibit bias toward zero near the terminal ends due to problems with appropriating variance. The equal-variance method does not show evidence of this attribute and was never the worst performer. The equal-variance method showed great promise in the ENSO project involving a 5-month running mean filter, and performed at least on par with the other realistic methods for almost all time series positions in all three filtering scenarios.

Corresponding author address: Dr. Anthony Arguez, NCDC, CAB 514, 151 Patton Avenue, Asheville, NC 28801. Email: anthony.arguez@noaa.gov

1. Introduction

Filtering (e.g., smoothing) is an essential tool for analyzing geophysical time series. Common applications include extraction of salient time scales, suppression of high-frequency noise (i.e., smoothing), and smoothing spectra in order to increase statistical confidence in spectral estimates. Time series filtering can be done in the frequency or time domains. Consider an input time series x(t) that is to be filtered to produce y(t), the output time series. Filtering in frequency space is accomplished by 1) applying a fast Fourier transform (FFT) to the time series (x), 2) multiplying by the frequency response function (FRF; H), and 3) back-transforming into time space:
$$X(f) = \mathrm{FFT}[x(t)], \tag{1}$$
$$Y(f) = H(f)\,X(f), \tag{2}$$
$$y(t) = \mathrm{FFT}^{-1}[Y(f)]. \tag{3}$$
We term this method the “spectral” method. As can be seen in (2), the FRF regulates the transfer of “spectral energy” from the input to the output series (the raw spectrum is proportional to the FFT of a time series multiplied by its conjugate). The FRF is defined using filter weights denoted as h(τ) as follows:
$$H(f) = \sum_{\tau=a}^{b} h(\tau)\, e^{-i 2\pi f \tau}. \tag{4}$$
Typically, however, time series filtering is computed in time space. This requires a convolution between the input time series and the filter weights:
$$y(t) = \sum_{\tau=a}^{b} h(\tau)\, x(t-\tau). \tag{5}$$
The filter weights are determined a priori by the user based on the spectral characteristics one wants to transfer to the output [for determining filter weights and additional background on digital filters, see Hamming (1989)]. In particular, we assume the filter is an acausal nonrecursive digital filter. The parameters a and b are integers that are usually chosen such that a is the negative of b. Thus, the number of filter weights becomes 2b + 1, resulting in an equal number of weights on each side of the central point (τ = 0). The parameter τ represents a time lag.
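The two routes described above can be compared in a minimal Python (NumPy) sketch. The function names (`frf`, `spectral_filter`, `time_filter`), the white-noise test series, and the sign convention of the complex exponential are assumptions made for illustration; they are not taken from the authors' IDL subroutines.

```python
import numpy as np

def frf(weights, freqs):
    """FRF of a centered filter: H(f) = sum over tau of h(tau) * exp(-2*pi*i*f*tau).
    The sign of the exponent is an assumed convention."""
    weights = np.asarray(weights, dtype=float)
    b = (len(weights) - 1) // 2                  # weights span lags -b..b
    lags = np.arange(-b, b + 1)
    return np.exp(-2j * np.pi * np.outer(freqs, lags)) @ weights

def spectral_filter(x, weights):
    """Spectral method: FFT, multiply by the FRF, inverse FFT."""
    H = frf(weights, np.fft.fftfreq(len(x)))     # FRF on the FFT frequency grid
    return np.real(np.fft.ifft(H * np.fft.fft(x)))

def time_filter(x, weights):
    """Time-domain convolution; mode='valid' keeps only full convolutions,
    so the first b and last b points are dropped."""
    return np.convolve(x, weights, mode="valid")

rng = np.random.default_rng(0)
x = rng.standard_normal(120)                     # stand-in input series
h = np.full(5, 0.2)                              # 5-point running mean, b = 2
y_spec = spectral_filter(x, h)                   # 120 values, endpoints retained
y_time = time_filter(x, h)                       # 116 values, endpoints lost
```

In this sketch the two outputs agree at interior points (`y_spec[2:-2]` matches `y_time`), while the spectral output also carries values at the four endpoint positions. Those endpoint values come from the periodic extension implied by the discrete Fourier transform.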

Filtering in the time domain results in lost points in the left and right endpoint intervals; this is a consequence of the convolution. Specifically, the convolution cannot be defined at the first b points and the last b points of the time series. In these regions, at least one of the lags is associated with an unavailable point in the time series (points beyond the terminal values). These points where the full convolution cannot be computed are customarily dropped from consideration. This is unfortunate because the most recent values (near the right endpoint) are lost, precluding effective real-time monitoring in situations where filtering is mandatory.

For example, take the simple case of a 5-month running mean [h(τ) = 0.2] to be computed on a time series consisting of monthly values from January 1997 through December 2006:
$$y(t) = 0.2 \sum_{\tau=-2}^{2} x(t-\tau). \tag{6}$$
The earliest filtered value that can be computed is for March 1997, resulting from the average of the values from January 1997 to May 1997. Similarly, the last filtered value that can be computed is for October 2006. The November and December 2006 filtered values cannot be computed precisely until subsequent data become available. Therefore, indices that are constructed by using an acausal nonrecursive filter lag real time by at least b time units (e.g., months), depending on whether the underlying data are available in real time. Note that the delay grows with the number of filter weights (b time units for a filter with 2b + 1 weights), making real-time monitoring of bandpass filtered series, such as the Madden–Julian oscillation (MJO), a challenging problem (Wheeler and Hendon 2004). This leads to the following question: in the above example, how can the November and December 2006 filtered values be estimated without knowing the future values necessary to compute their exact values?
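The bookkeeping in this example can be checked with a short Python sketch. The random monthly values and the helper `month_label` are hypothetical stand-ins; only the series length (120 months) and the filter half-width (b = 2) come from the text.

```python
import numpy as np

n_months = 120                      # January 1997 through December 2006
b = 2                               # 5-month running mean: weights at lags -2..2
weights = np.full(2 * b + 1, 0.2)

# Stand-in monthly values (hypothetical data, used only to count outputs)
x = np.random.default_rng(1).standard_normal(n_months)
y = np.convolve(x, weights, mode="valid")       # full convolutions only

def month_label(index):
    """Map a 0-based month index (0 = January 1997) to (year, month)."""
    return 1997 + index // 12, index % 12 + 1

first_filtered = month_label(b)                  # earliest computable output
last_filtered = month_label(n_months - 1 - b)    # latest computable output
print(len(y), first_filtered, last_filtered)
```

The sketch yields 116 filtered values running from March 1997 to October 2006, so b = 2 months are lost at each end.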

In the present study, a technique to estimate endpoint values is presented that is based on the FRF of an a priori filter. Note that the technique is based on the FRF, making it a generalized method for all filtering schemes. Variable filter weights in the endpoint intervals are determined such that the FRF optimally (in the least squares sense) reproduces the predetermined FRF of the interior points. Three different constraints are applied: an equal-mean constraint, an equal-variance constraint, and unconstrained (i.e., no constraint). The equal-mean constraint forces the newly determined weights in the endpoint interval to sum up to the same value as the interior filter weights, preserving the expected mean of the output series. Utilizing the relationship between variance and the raw spectrum, the equal-variance method forces the output time series variance to be very close to the expected variance. (Subroutines of these techniques are available via ftp://ftp.ncdc.noaa.gov/pub/data/aarguez in the IDL programming language.) The techniques are tested on daughter time series (created using a Monte Carlo technique) of three well-known climate modes: the El Niño–Southern Oscillation (ENSO), the Arctic Oscillation (AO), and the MJO. For the ENSO project, empirically determined filter weights are also presented, representing a minimum bound for estimation error.

In addition to comparisons with the spectral filtering method, the least squares schemes are compared to the methods described in Mann (2004, hereafter Ma04) for applying boundary constraints when smoothing nonstationary geophysical time series (e.g., global-averaged surface temperature). Ma04 utilized three methods that minimize the zeroth, first, and second derivatives of smoothed time series near their boundaries: the minimum norm, minimum slope, and minimum roughness constraints, respectively. Instead of applying these constraints formally in the frequency domain (see Park 1992; Ghil et al. 2002), Ma04 used suitable approximations in the time domain. For the “norm” constraint, the series is extended with the mean value of the interior points. The “slope” constraint is applied by extending the time series symmetrically beyond its termini. For the “rough” constraint, the series is also extended symmetrically, but the extension is then reflected about the horizontal line through the terminal point, approximating the sign change in concavity required for minimizing a second derivative. Note that the Ma04 approach is based on applying the same filter weights to extended time series, whereas the least squares methods proposed herein are based on changing the filter weights and not utilizing any time series values (actual or projected) beyond the termini. While the three methods in Ma04 were applied specifically to smoothing (a specific type of filtering) on nonstationary climate time series (e.g., a series with a prominent trend at the end of the time series), they are included in the present investigation for the sake of comparison and completeness.
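The three time-domain extensions described for Ma04 can be sketched in Python as follows. This is an illustration under stated assumptions rather than Ma04's code: the function name `extend_right` is hypothetical, only the right end is padded, the reflection does not repeat the terminal point, and the "norm" padding uses the mean of the whole available series.

```python
import numpy as np

def extend_right(x, b, method):
    """Pad the right end of x with b values before filtering (hypothetical helper).

    'norm'  : pad with the series mean (assumed to be the mean of all of x)
    'slope' : mirror the series in time about the terminal point
    'rough' : mirror in time, then flip about the terminal value
    """
    x = np.asarray(x, dtype=float)
    mirrored = x[-2:-b - 2:-1]              # x[N-2], x[N-3], ..., x[N-1-b]
    if method == "norm":
        pad = np.full(b, x.mean())
    elif method == "slope":
        pad = mirrored
    elif method == "rough":
        pad = 2.0 * x[-1] - mirrored
    else:
        raise ValueError(f"unknown method: {method}")
    return np.concatenate([x, pad])

x = np.array([1.0, 2.0, 3.0, 4.0])
norm_ext = extend_right(x, 2, "norm")      # [1, 2, 3, 4, 2.5, 2.5]
slope_ext = extend_right(x, 2, "slope")    # [1, 2, 3, 4, 3, 2]
rough_ext = extend_right(x, 2, "rough")    # [1, 2, 3, 4, 5, 6]
```

Once the series is extended, the unchanged a priori weights can be applied all the way to the original terminus. This is the key difference from the least squares schemes, which change the weights and use no values beyond the termini.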

The data and methods are detailed in section 2, including a discussion on generating multiple samples and the least squares technique. It is followed by the results of the analyses in section 3. Concluding remarks are offered in section 4.

2. Data and methods

a. Sample time series

The present investigation uses three climate-mode time series: an ENSO, an AO, and an MJO time series. These time series were chosen because of their prominence in climate research, as well as to reinforce the notion that the least squares techniques are applicable to a wide range of filtering scenarios. ENSO indices are computed in many different ways, most commonly by averaging sea surface temperature (SST) anomalies over a given domain box in the tropical Pacific. A commonly used indicator region is the Niño-3.4 region (5°S–5°N, 120°–170°W). Monthly values from the extended reconstructed sea surface temperature (ERSST) dataset (see Smith and Reynolds 2003), binned into 2° by 2° grid boxes, from 1946 to 2000 are utilized to compute the box-averaged time series for the Niño-3.4 region (Fig. 1a). The most common filters applied to SST-based ENSO indices are running means of 3 or 5 months. In the present investigation, a 5-month running average is used (the FRF is shown in Fig. 1b).

The AO is defined using empirical orthogonal function (EOF) analysis (see Thompson and Wallace 1998). The AO time series is defined as the leading principal component (PC) time series of National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis sea level pressure (SLP) north of 20°N from 1948 to 2002 (Fig. 2a). Since the AO definition utilizes EOF analysis and does not make any assumptions on the time scale, the time series has spectral energy spread across the frequency domain. Therefore, filtering of the AO time series is utilized to extract the component of AO variability within a particular frequency range. In the present investigation, our AO filter will be a low-pass filter comprising 28 consecutive passes of the 1–2–1 filter (see Von Storch and Zwiers 1999), which is also known as the von Hann filter or simply as a “Hanning.” This results in a 57-point Gaussian filter; the FRF is shown in Fig. 2b.
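The length of the AO filter follows from repeated convolution: each pass of the three-point 1–2–1 filter widens the equivalent weight array by two points, so 28 passes give 2 × 28 + 1 = 57 weights. A short Python sketch (the function name `repeated_121` is hypothetical, and the 1–2–1 filter is assumed to be normalized to [0.25, 0.5, 0.25]):

```python
import numpy as np

def repeated_121(passes):
    """Equivalent single-pass weights for `passes` applications of the
    normalized 1-2-1 filter [0.25, 0.5, 0.25]."""
    kernel = np.array([0.25, 0.5, 0.25])
    weights = np.array([1.0])               # identity filter to start
    for _ in range(passes):
        weights = np.convolve(weights, kernel)
    return weights

ao_weights = repeated_121(28)
print(len(ao_weights), ao_weights.sum())
```

The resulting weights are binomial coefficients scaled to sum to 1, which is why the 57-point filter is bell shaped and close to a Gaussian.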

Unlike the AO and ENSO, the MJO is unique because its time scale is its defining characteristic. Therefore, MJO time series are customarily defined postfiltering. Alternatively, the MJO can be defined using EOF-type analysis of nonfiltered data (see Arguez et al. 2005). The MJO was first observed in station time series (Madden and Julian 1971) using spectral analysis. In the present investigation, a time series must be obtained that reveals intraseasonal variability after bandpass filtering. Analogous to the pioneering work of Madden and Julian (1971), this is accomplished by using a point source in the MJO domain. The time series is obtained from a gridded Quick Scatterometer (QuikSCAT) zonal velocity field (see Pegion et al. 2000) at the equator and 80°E (Fig. 3a). The time series comprises 392 pentad averages from late July 1999 through early December 2004. Based on the results from Arguez et al. (2005), 31 Lanczos weights are utilized (see FRF in Fig. 3b).

b. Generating multiple samples

Investigating the impact of the endpoint schemes on one climate time series may provide anecdotal evidence of their efficacy, but a significantly larger sample size is necessary to increase confidence in the filtering method. On the other hand, using a large number of purely random time series may help in terms of increased confidence, but it would also be beneficial to create daughter time series that, in the spectral average sense, are acceptably similar to the mother time series. To accomplish both goals, a manipulation of white noise is utilized (see Schoof et al. 2005). For each of the three climate indices, the FFT is computed. The resulting function is then utilized as the FRF to be imposed on white-noise time series.

Ten thousand randomly generated, uniformly distributed time series are created, representing white noise. Each new sample is passed through an FFT and multiplied by the FRF before inverse Fourier transforming back into the time domain. Finally, the 10 000 daughter time series are normalized using each individual mean and standard deviation. This results in 10 000 daughter time series that are representative of the original climate series (e.g., the ENSO time series). Note that this protocol is analogous to filtering in spectral space except that, here, the FRF is used to impose, rather than regulate, the spectral characteristics of the output time series. Therefore, the only difference is in scientific intent; the mathematics is completely identical. Although white noise is utilized to create these time series, it must be emphasized that the resulting daughter time series are not representative of white noise, since the “red” spectral properties have been imparted on them. Using these daughter time series, the various endpoint estimate methods are assessed by computing the root-mean-square errors (RMSEs) for each position in the endpoint interval.
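The daughter-series protocol can be sketched in Python as follows. The function name `make_daughters`, the uniform range of −1 to 1, and the random-walk "mother" series are assumptions made for illustration; the paper's mother series are the ENSO, AO, and MJO indices.

```python
import numpy as np

def make_daughters(mother, n_samples, rng):
    """Shape uniform white noise with the FFT of the mother series, then
    normalize each daughter by its own mean and standard deviation."""
    mother = np.asarray(mother, dtype=float)
    shaping = np.fft.fft(mother)                        # used as the imposed FRF
    noise = rng.uniform(-1.0, 1.0, size=(n_samples, len(mother)))
    shaped = np.fft.ifft(np.fft.fft(noise, axis=1) * shaping, axis=1).real
    shaped -= shaped.mean(axis=1, keepdims=True)
    shaped /= shaped.std(axis=1, keepdims=True)
    return shaped

rng = np.random.default_rng(2)
# Stand-in mother series: a random walk, NOT one of the paper's climate indices
mother = np.cumsum(rng.standard_normal(256))
daughters = make_daughters(mother, 1000, rng)           # 1000 samples of length 256
```

Because both spectra being multiplied belong to real series, the inverse transform is real up to round-off, and taking `.real` only discards that round-off.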

c. The least squares filtering schemes

As stated earlier, the problem with filtering in the time domain is that the full convolution cannot be computed at the endpoints of x(t). This leads to the following question: can variable-length filter weights be determined in the endpoint intervals such that the signal extraction replicates that observed in the interior points? In other words, how can an incalculable acausal filter (a filter that depends on past and future values) be replaced with a causal filter (a filter that depends on past and current values only) for endpoint estimation? For this project, a penalty function is constructed to minimize the squared error between the FRF in the interior of the time series and a new FRF at each point of the endpoint intervals. Following (4), the FRFs of interior points are denoted as follows:
[Equation (7)]
where α represents the new filter weights to be determined. The value of b is incrementally decreased to force a symmetric filter at each point. Consider once again a 5-month running average. For the second and penultimate (i.e., second to last) point in the time series (b = 1), three weights are determined. For the first point and the last point of the time series, only one weight is determined (b = 0). The result is a set of filter weights for each point in the endpoint interval. Without constraints, this minimization technique truncates the a priori filter weights (see Bloomfield 2000). In the case of the 5-month running average, the second and penultimate points would be computed using [1/5, 1/5, 1/5] as the three weights, whereas the first and last points in the time series would simply equal one-fifth the input value. A practical constraint is to force the new weights to sum to 1 (see Arguez et al. 2005), thereby preserving the expectation of the mean of the output series. The cost function for a given point L in the endpoint interval is a function of α and λ (the Lagrange multiplier that imposes our constraint):
[Equation (8)]
The minimum of this penalty function is obtained by taking the partial derivatives with respect to each α and λ and setting them equal to zero:
[Equation (9)]
[Equation (10)]
Equations (9) and (10) are solved simultaneously using simple matrix operations to arrive at the coefficients. We term this method the equal-mean method. Using the 5-month running mean as an example once again, the three weights for computing the second and penultimate values would each be 1/3, whereas the first and last values remain unchanged (α0 = 1; b = 0). It can be shown algebraically that the resulting weights from the equal-mean constraint for a running-mean filter, regardless of the length of the a priori filter, will always be the average over the filtering interval (e.g., 1/3, 1/5, 1/7, 1/9, etc.).
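A minimal Python sketch of the unconstrained and equal-mean fits is given below. It is a sketch under assumptions, not the authors' IDL routine: the function name `endpoint_weights`, the uniform grid of 512 frequencies, the sign of the complex exponential, and the sum-of-squares form of the penalty are all illustrative choices. The equal-mean case is solved as a linear system in the weights and one Lagrange multiplier.

```python
import numpy as np

def endpoint_weights(h, b_new, constrain_mean=False, n_freq=512):
    """Least squares fit of 2*b_new + 1 symmetric-lag weights whose FRF
    approximates the FRF of the full weights h (hypothetical helper)."""
    h = np.asarray(h, dtype=float)
    b = (len(h) - 1) // 2
    freqs = np.arange(n_freq) / n_freq                    # assumed frequency grid

    def basis(half_width):
        lags = np.arange(-half_width, half_width + 1)
        return np.exp(-2j * np.pi * np.outer(freqs, lags))

    H = basis(b) @ h                                      # predetermined FRF
    E = basis(b_new)                                      # basis for the new FRF
    A = np.real(E.conj().T @ E)                           # normal-equation matrix
    c = np.real(E.conj().T @ H)
    if not constrain_mean:
        return np.linalg.solve(A, c)                      # unconstrained fit

    # Equal-mean fit: 2*A*alpha + lam*1 = 2*c together with sum(alpha) = sum(h)
    m = 2 * b_new + 1
    kkt = np.zeros((m + 1, m + 1))
    kkt[:m, :m] = 2.0 * A
    kkt[:m, m] = 1.0
    kkt[m, :m] = 1.0
    rhs = np.concatenate([2.0 * c, [h.sum()]])
    return np.linalg.solve(kkt, rhs)[:m]

h = np.full(5, 0.2)                                       # 5-month running mean
unconstrained = endpoint_weights(h, 1)                    # truncated weights
equal_mean = endpoint_weights(h, 1, constrain_mean=True)  # averaged weights
terminal = endpoint_weights(h, 0, constrain_mean=True)    # single weight
```

For the running mean this sketch reproduces the values quoted in the text: the unconstrained fit returns the truncated weights [1/5, 1/5, 1/5], the equal-mean fit returns [1/3, 1/3, 1/3], and the terminal equal-mean weight is 1.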
An alternative, albeit more complicated, constraint is to force the total variance to be preserved. Variance is related to the spectrum via the following relation:
[Equation (11)]
By definition, time series filtering alters the variance of an input signal, and the change in variance is controlled by the filter weights (i.e., the FRF). Typically, time series filtering (e.g., smoothing) results in a (sometimes drastic) reduction in variance. This is somewhat intuitive, if one considers that filtering is often employed to “smooth out” outliers, in effect displacing values toward the mean, thereby reducing the time series variance. Therefore, there is a natural desire to ensure that endpoint estimates account for the deliberate effect that filtering has on time series variance. Combining (11) with (2) and discretizing, the following equality will be imposed to ensure that total variance in the interior will match the total variance of each point in the endpoint interval:
[Equation (12)]
The asterisks indicate complex conjugates. Note that the constraint is dependent on the input time series (specifically, its raw spectrum), unlike all other methods employed in the present study. Modifying (8) to impose the new constraint and incorporating the definition in (12), results in the new cost function:
[Equation (13)]
where T is the variance of interior points and is proportional to the left-hand side of (12). As before, the minimum is found by partial differentiation and setting to 0:
[Equation (14)]
[Equation (15)]
These equations are not solvable with simple matrix manipulation because of nonlinearities between the Lagrange multiplier and the coefficients in (14) and amongst the coefficients in (15). The equations are solved using the Newton method available in the IDL programming language. We term this method the equal-variance method. Note that, for the terminal points in the series, the output values are computed by scaling the original termini by the standard deviation of the interior points, since the expected variance is preserved. Since the standard deviation depends on the input time series, the weights for the equal-variance method will generally vary from one sample time series to another. For the unconstrained and equal-mean methods, the weights do not depend on the input time series, and they are therefore uniform for each and every input time series.
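The terminal-point case (one weight, b = 0) of the equal-variance constraint can be written in closed form, which gives a quick check on the idea without a nonlinear solver. The Python sketch below is an illustration under assumptions: the function name `passed_variance`, the white-noise input, and the discrete form of the variance (sum of |H|²|X|² over the FFT grid, divided by N²) are choices made here. The general multi-weight case, which the authors solve with a Newton method, is not shown.

```python
import numpy as np

def passed_variance(weights, x):
    """Variance that `weights` would pass from x, computed spectrally as
    (1/N^2) * sum over f of |H(f)|^2 * |X(f)|^2 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    b = (len(weights) - 1) // 2
    freqs = np.fft.fftfreq(len(x))
    H = np.exp(-2j * np.pi * np.outer(freqs, np.arange(-b, b + 1))) @ weights
    X = np.fft.fft(x - x.mean())
    return np.sum(np.abs(H) ** 2 * np.abs(X) ** 2) / len(x) ** 2

rng = np.random.default_rng(3)
x = rng.standard_normal(660)                 # stand-in input series
x = (x - x.mean()) / x.std()                 # normalized, as for the daughters
h = np.full(5, 0.2)                          # interior 5-point running mean

T = passed_variance(h, x)                    # variance of interior filtered values
# One-weight filter: alpha0^2 * var(x) = T, so alpha0 is a ratio of standard deviations
alpha0 = np.sqrt(T / passed_variance([1.0], x))
```

With a normalized input, `alpha0` equals the standard deviation of the interior filtered values, consistent with the scaling of the termini described above. Because `T` depends on the spectrum of `x`, the weight changes from one input series to another.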

d. Test methodology

The equal-mean, equal-variance, and unconstrained methods will be applied to subsets of time series of ENSO, the AO, and the MJO. These “estimates” will be compared to the “true” filtered values. Please consult Fig. 4, which shows a contrived 24-point time series that helps explain the test methodology. The top series represents the input time series to be filtered with a 5-month running mean (b = 2). The bottom series shows the output time series. The diamonds represent the estimation zone. The green squares are only used to compute “true” values; these represent the points in time that are not available to compute the full convolution. For example, values 20–24 are averaged to compute the true value for position 22. Positions 23 and 24 represent future data points that cannot be accounted for when estimating the filtered value for position 22, which is indicated by the rightmost blue diamond. The gray circles represent the “interior points” that are not subject to estimation because full convolutions can be computed for these positions. As indicated by line drawings, the output value for position 12 is computed by averaging values 10–14. Note that, as stated earlier, the estimates are computed using filter arrays that are symmetric, resulting in three weights to be determined to compute output values for positions 4 and 21, and one weight only for the terminal ends (positions 3 and 22).

A similar procedure is followed for the 31-point Lanczos weights (b = 15) in the MJO project, as well as the 57-point Gaussian weights (b = 28) in the AO project. For the MJO project, the series are extended by 15 points on each end (there would be 15 green squares on each side of the time series). For the AO project, the series are extended by 28 points on both ends. Note that the parameter b determines the number of extra points needed in one direction (left or right end), and that 2b + 1 is the length of the filter weights array.

To assess the viability of the least squares filtering schemes, the RMSE is computed at each point in the estimation zones. Since the methodology does not impart any preference to the left or the right estimation zones, the RMSE values should be symmetric between the left and right zones. In addition, the RMSE values will also be computed for the spectral filtering method (the standard of comparison) and the three techniques used in Ma04. Separate RMSE figures are computed for each of the filtering scenarios: the AO, the MJO, and ENSO projects. To infer whether the various results are accurately appropriating variance, the variance of the output array (the 10 000 simulations) will be computed for each point in the estimation zones, and then compared to the “true” variances (the variance of the 10 000 “true” values for each point in the estimation zone). By definition, the equal-variance constraint should produce output whose overall variances are close to the true variances. Note that the filtering results are not shown for the mother time series, since results from such a small sample size (N = 1) are not robust.
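The scoring step can be sketched in Python for the right endpoint zone. Everything here is illustrative: the function name `endpoint_scores` is hypothetical, and unit-variance white noise stands in for the daughter series, so the numbers it produces are not the paper's results. Each series is "extended" in the sense of Fig. 4, with its last b values treated as future data that enter only the true values.

```python
import numpy as np

def endpoint_scores(series, h, weight_sets):
    """RMSE and variance ratio at the right endpoint positions.

    series      : (n_samples, n) array; the last b columns play the role of
                  future values, used only to compute the "true" outputs
    weight_sets : weight_sets[k] holds the 2k+1 weights used k points in
                  from the terminus (k = 0 is the terminal point)
    """
    series = np.asarray(series, dtype=float)
    h = np.asarray(h, dtype=float)
    b = (len(h) - 1) // 2
    last = series.shape[1] - 1 - b               # last "available" position
    scores = []
    for k, w in enumerate(weight_sets):
        w = np.asarray(w, dtype=float)
        pos = last - k
        truth = series[:, pos - b:pos + b + 1] @ h[::-1]     # full convolution
        estimate = series[:, pos - k:pos + k + 1] @ w[::-1]  # shortened filter
        rmse = np.sqrt(np.mean((estimate - truth) ** 2))
        scores.append((rmse, estimate.var() / truth.var()))
    return scores

rng = np.random.default_rng(4)
samples = rng.standard_normal((20000, 60))       # white-noise stand-in samples
h = np.full(5, 0.2)
unconstrained = endpoint_scores(samples, h, [[0.2], [0.2, 0.2, 0.2]])
equal_mean = endpoint_scores(samples, h, [[1.0], [1 / 3, 1 / 3, 1 / 3]])
```

For unit-variance white noise the terminal-point RMSE of the truncated weights should be close to 0.4 (the error is 0.2 times the sum of four unseen or unused values), and the variance ratio close to 0.2. Series with long decorrelation times, such as the ENSO daughters, behave differently, which is why the ranking of methods in the paper need not match this white-noise check.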

The errors due to the least squares methods increase closer to the termini due to two primary estimation errors. First, fewer data are available for the computation (data beyond the terminal ends are blindly disregarded). These random errors are not absolutely quantifiable, since there are no restrictions on the particular values a time series can take beyond the endpoints. Second, the ability to reproduce the FRF is hampered by a reduced number of allowed filter weights. This second source of error is especially problematic for bandpass filters, such as the Lanczos filter used in the MJO project, since the weights must include both positive and negative weights to extract salient time scales (Arguez et al. 2005).

As a final analysis, empirical weights are determined for the ENSO project. The entire 1946–2000 period, save a few months near the termini, is used to determine the weights that minimize the RMSE as a filter of the following form proceeds throughout the period of record:
[Equation (16)]
Note that (16) applies only to the right endpoint. For the penultimate point, the following filter is used:
[Equation (17)]
The coefficients are determined to the thousandths place. The resulting empirically determined filters represent a lower bound on the RMSE that the theoretically based filtering schemes (the least squares methods, the spectral method, and the Ma04 constraints) can attain.

3. Results

a. ENSO

In this project, domain-averaged SST is used to define the ENSO time series. Unlike other ENSO indicators, such as the noisy Southern Oscillation index (a pressure difference between Darwin, Australia, and Tahiti), SST-based indices have disproportionately less high-frequency variability, attributable to the high heat capacity and inertia of the ocean. Therefore, SST indices have much longer decorrelation times than atmospheric indices; longer decorrelation times, in turn, are linked to potential predictability.

For the ENSO project, the equal-variance method performs the best, followed closely by the slope, equal-mean, and rough methods (Fig. 5). The RMSEs for the equal-variance method’s penultimate and terminal points are 0.118 and 0.242, respectively. The three second-tier performers all slightly overestimate the variance (Fig. 6), whereas the equal-variance method results have a variance close to that of the “true” results. The three remaining schemes (spectral, norm, and unconstrained) have even larger RMSE values and underestimate the variance. The rough and equal-mean methods perform as well as they do because the ENSO time series has a large decorrelation time and the filter width is very small (b = 2). On the other hand, the norm and unconstrained methods underperform because of their tendencies to approach zero. Note the level to which the equal-variance method outperforms the spectral method, which is utilized in this study as the standard of comparison. The equal-variance method generally outperforms the spectral method for the interior points of the estimation zone, and it outperforms the spectral method at all points in the estimation zone when the filter width is not very large.

In this project, only two estimates are computed per time series end for each filtering method tested. For the unconstrained case, all the weights remain at 0.2 since this method simply truncates the weights. For the equal-mean case, the weights become 1/3 when computing the second and penultimate points, and 1 when computing the terminal ends. Therefore, the unconstrained case tends to result in values that veer toward 0 near the terminal ends, as implied in Fig. 6. On the other hand, many of the other methods remain close to the original prefiltered values, explaining the relative success of the equal-mean scheme over the unconstrained scheme.
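The contrast between the two weightings can be sketched for the 5-point running mean (b = 2). This is only an illustration, not the authors' code: it assumes the shortened endpoint filters are symmetric about the estimated point, which is one way the quoted weights (1/3 and 1) arise. The function names and the sample series are hypothetical.

```python
def endpoint_weights(j, b=2, equal_mean=False):
    """Weights of the shortened, symmetric filter at distance j from the end.

    j = 0 is the terminal point, j = 1 the penultimate point.
    Assumes the full filter is a (2b+1)-point running mean.
    """
    n_points = 2 * j + 1
    if equal_mean:
        # Equal-mean case: weights are rescaled so that they sum to 1.
        return [1.0 / n_points] * n_points
    # Unconstrained case: weights are simply truncated and stay at 1/(2b+1).
    return [1.0 / (2 * b + 1)] * n_points


def apply_at_end(series, j, weights):
    """Apply symmetric weights centered j points in from the series end."""
    center = len(series) - 1 - j
    half = len(weights) // 2
    window = series[center - half:center + half + 1]
    return sum(w * x for w, x in zip(weights, window))


# Hypothetical anomaly series that ends near a maximum.
series = [0.5, 0.8, 1.0, 1.2, 1.5]
for j in (1, 0):
    unconstrained = apply_at_end(series, j, endpoint_weights(j))
    equal_mean = apply_at_end(series, j, endpoint_weights(j, equal_mean=True))
    print(j, round(unconstrained, 3), round(equal_mean, 3))
```

In this toy case the unconstrained terminal estimate is 0.2 times the last value (veering toward 0), whereas the equal-mean terminal estimate reproduces the last value unchanged.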

The empirically determined weights are presented in Table 1. The associated RMSE values are 0.155 for the termini and 0.073 for the second/penultimate points. This compares to RMSE values of 0.242 and 0.118, respectively, for the best performing theoretically based scheme (the equal-variance method). The “true” filtered values result in a variance reduction of 14.1% (compared to the input time series’ variance). The empirical filters (16) and (17) reduce variance by 17.1% and 14.7%, respectively. Note that the variance ratios are closer to unity than all other schemes except the equal-variance method (Fig. 6), suggesting that the proper appropriation of variance is a key component in accurate endpoint estimation.
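The two diagnostics used throughout this section, the RMSE against the “true” filtered values and the ratio of estimated to “true” variance, can be sketched as follows. The function names and the sample values are hypothetical, and the use of population variance is an assumption; the paper does not state which variance estimator was applied.

```python
import math
from statistics import pvariance


def rmse(estimates, truth):
    """Root-mean-square error between endpoint estimates and true values."""
    squared = [(e - t) ** 2 for e, t in zip(estimates, truth)]
    return math.sqrt(sum(squared) / len(squared))


def variance_ratio(estimates, truth):
    """Variance of the estimates scaled by the variance of the true values.

    A ratio below 1 indicates damping; above 1 indicates overshooting.
    """
    return pvariance(estimates) / pvariance(truth)


# Hypothetical "true" filtered endpoint values collected over several subsets,
# and estimates from a scheme that damps every value by half.
truth = [1.0, -1.0, 2.0, -2.0]
damped = [0.5 * t for t in truth]

print(round(rmse(damped, truth), 3))
print(variance_ratio(damped, truth))
```

A scheme that halves every value yields a variance ratio of 0.25, which illustrates why a low RMSE alone does not guarantee a representative output series.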

b. AO

The RMS errors for the AO filtering project suggest that all seven filtering schemes work very well in the interior of the estimation zone (Fig. 7). The expected RMSE between filtered values of randomly selected input series (using the same procedures as above on scaled white noise series) is about 0.4, representing a liberal skill threshold. In other words, the liberal skill threshold is an upper bound of skill based on purely random time series (as opposed to the daughter time series utilized for determining RMSE). Except for the equal-mean and the roughness methods, the filter schemes show skill for all endpoint positions. The top performers (in terms of RMSE) are the slope and norm schemes. Intermediate performers include the equal-variance, spectral (within a quarter filter width of the termini), and unconstrained methods. The poorest performers are the equal-mean and rough schemes.
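One reading of the liberal skill threshold is the RMSE between the filtered values of two independent white noise series. The sketch below follows that reading under stated assumptions: it uses a simple running mean rather than the AO filter, unit-variance Gaussian noise, and hypothetical function names, so it will not reproduce the threshold of about 0.4 quoted above.

```python
import math
import random


def running_mean(series, b):
    """(2b+1)-point running mean over positions that have a full window."""
    width = 2 * b + 1
    return [sum(series[i - b:i + b + 1]) / width
            for i in range(b, len(series) - b)]


def liberal_skill_threshold(n=20000, b=2, scale=1.0, seed=0):
    """RMSE between filtered values of two independent white noise series."""
    rng = random.Random(seed)
    first = running_mean([rng.gauss(0.0, scale) for _ in range(n)], b)
    second = running_mean([rng.gauss(0.0, scale) for _ in range(n)], b)
    squared = [(x - y) ** 2 for x, y in zip(first, second)]
    return math.sqrt(sum(squared) / len(squared))


threshold = liberal_skill_threshold()
print(round(threshold, 2))
```

For unit-variance noise and a 5-point running mean, the result should lie near sqrt(2/5), about 0.63, because each filtered series has variance 1/5 and the two series are independent.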

The norm scheme tends toward the mean value at the boundary. This results in a tendency toward zero for climate oscillation anomalies, yielding reduced variance (damping) at the endpoints (Fig. 8). The slope scheme tends toward a constant value at the boundary, resulting in overestimated variance near the termini for the AO case. This is undoubtedly due primarily to instances where the constant value approached is large in absolute value (e.g., when the original series endpoints are near absolute extrema). The unconstrained method suffers a similar fate. Recall that the unconstrained filters are truncated: the outer weights are replaced with zeroes, resulting in sharply reduced variance. In the AO case, the unconstrained method produced respectable results in an RMSE sense, but, as shown in the ENSO case, this relative success is not robust; it arises from the nature of the filtering problem in the AO case. In summary, although the norm, slope, and unconstrained methods have some success in an RMSE sense for this case, they are unreliable methods if representative output time series are desired.
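The boundary behavior of the three Ma04 constraints can be illustrated with the padding formulation usually associated with Mann (2004): the series is extended past its end before filtering. This is a sketch under that assumption, not the implementation used in this study; the function name and the sample series are hypothetical.

```python
def pad_end(series, n_pad, constraint):
    """Extend a series by n_pad values past its end (n_pad < len(series)).

    'norm'  : pad with the series mean (smooth tends toward the mean).
    'slope' : reflect the series about the time boundary (smooth flattens).
    'rough' : reflect about the time boundary and about the final value
              (smooth continues the local trend).
    """
    last = series[-1]
    # Values preceding the final point, nearest first: x[N-2], x[N-3], ...
    mirrored = series[-2:-2 - n_pad:-1]
    if constraint == "norm":
        mean = sum(series) / len(series)
        padding = [mean] * n_pad
    elif constraint == "slope":
        padding = list(mirrored)
    elif constraint == "rough":
        padding = [2 * last - x for x in mirrored]
    else:
        raise ValueError("constraint must be 'norm', 'slope', or 'rough'")
    return list(series) + padding


# Hypothetical series ending on a steady upward trend.
series = [0.0, 1.0, 2.0, 3.0]
for name in ("norm", "slope", "rough"):
    print(name, pad_end(series, 2, name)[-2:])
```

For a series that ends on a trend, only the rough padding continues upward; the norm padding returns to the mean and the slope padding turns back, which is consistent with the damping and over- or undershooting described above for trend-free oscillations.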

Both the spectral and equal-variance methods show skill for all grid points, with the equal-variance method outperforming for most of the interior points in the estimation zone, and the spectral method outperforming slightly within one-eighth filter width of the termini. Both methods do a respectable job in appropriating variance as well. The rough and equal-mean constraints perform very poorly for the AO case. The equal-mean case is handicapped by forcing the weights to sum to unity, causing gross overestimates (much larger variance) near the termini. The rough constraint is not designed for trend-free climate oscillations (i.e., stationary time series), and it is therefore not surprising that it underperforms in all three cases.

It is important to comment on the relative failure of the equal-mean filter and the relative success of the unconstrained filter near the terminal ends. The equal-mean filter suffers in part because the sum of the filter weights used to compute all three interior FRFs (H) was set to 1 (this can be changed and will be considered in future work). This implies that the two terminal values do not change because they are simply multiplied by 1. As a result, when the input subset ends at a series maximum or minimum, the errors with the equal-mean method are disproportionately large. The unconstrained method, on the other hand, performs very well (as shown in the ENSO project, this is not a robust result). Here, however, somewhat of the opposite issue occurs. Instead of being constrained to equal a certain value or to maintain a variance, the weights are simply truncated. This results in decreased variance close to the terminal ends. In the case of the AO project, this reduction in variance happened to be at an appropriate level to allow strong performance; this is no more than a coincidence based on the specifics of the filtering parameters. If the level of variance reduction is not appropriate, large errors can result (as in the ENSO project).

c. MJO

Filter performance in the MJO project is erratic (Fig. 9). As in the AO case, the lowest RMSE values are found in the norm and slope schemes, yet the variance ratios are neither stable nor close to unity for all endpoint positions (Fig. 10). In fact, the variance ratios appear to be affected by the MJO’s 30–60-day time scale. Variance ratios close to unity throughout the estimation range are evident only in the equal-variance and spectral methods. RMSE values of the equal-variance method are lower than those of the spectral method except for the four points closest to the termini (inclusive). The liberal skill threshold in this case is about 70, suggesting that all methods are useful for all time series positions. Looking at typical results (not shown), the issue of partitioning variance at the terminal ends proves to be problematic for the equal-mean and unconstrained methods, with both evincing outputs that tend to veer toward zero at the terminal points. In other words, both methods are plagued by reduced variance toward the terminal points. Near the endpoints, the equal-variance method’s RMSE values are similar to those of the equal-mean and unconstrained methods, but it does not exhibit the veering-to-zero characteristic: the equal-variance method results are partial to neither overshooting nor undershooting the true values.

4. Summary and conclusions

In the present investigation, different options were explored for retaining time series endpoints when filtering. The spectral method, three variations of a least squares technique, and the three boundary constraints in Ma04 were utilized. As one would expect, the resulting errors are typically greater for the positions closest to the terminal ends. The most consistent least squares technique was the equal-variance method, which is also the most computationally expensive. By imposing consistent appropriation of variance in the estimation zones, this method was not prone to over- or undershooting the true output, an attribute that was present to certain degrees with all other methods. In addition, the equal-variance method is unique because it is customized to a particular time series input through the constraint. Overall, the equal-variance scheme was arguably at least on par with the others (including the spectral method) for most endpoint positions in the AO and MJO projects, and was clearly the best performer for the ENSO study. It is the authors’ contention that the equal-variance method is a suitable alternative to simply dropping endpoints, a common practice in the geosciences.

Although the Ma04 constraints, particularly the rough constraint, are not necessarily designed for presumably stationary time series (ENSO, AO, and MJO time series), the techniques were included herein for completeness. As shown here and hinted in Ma04, the slope and norm constraints are prone to under- or overshooting, while the rough constraint is prone to dramatic overshooting when there is not a prominent trend at the end of a time series. Future work should be dedicated to assessing the slope threshold at which a method like the equal-variance scheme should be replaced by the rough constraint. In other words, time series parameters should guide what type of endpoint estimation scheme is utilized since, for instance, the rough constraint would far exceed the performance of all other schemes presented here for the case of globally averaged temperature.

Wheeler and Hendon (2004) point out that the major obstacle to real-time monitoring and prediction of the MJO, for example, is the loss of endpoints due to bandpass filtering. This is an inherent problem when utilizing acausal digital filters. In the case of real-time monitoring of any climate time series, the equal-variance method provides a useful “first guess” without having to wait for future values to be recorded. In this regard, the equal-variance filtering scheme is an attractive alternative to the more indirect methods used for real-time monitoring (e.g., see Wheeler and Hendon 2004), which rely on regressing data on predetermined, fixed spatial patterns. However, it should be noted that, in the case of the MJO, the spectral method is also a suitable alternative, especially very close to the termini. How the methods described in the present investigation compare to Wheeler and Hendon’s (2004) EOF-based monitoring algorithm is worthy of a detailed inspection.

It is important to note that the techniques described herein do not involve time series prediction in any manner. Future work may involve combining the least squares technique with a statistical prediction method that takes on increasing weight closer to the termini. Simple autoregressive forecasting (and hindcasting) showed promise near the endpoints in preliminary analyses, both when extending the output (filtered) series with a forecast and when forecasting the input (prefiltered) series in order to compute a full convolution. However, for the common applications where no prediction error is acceptable, the equal-variance method is worthy of consideration.
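The second autoregressive option mentioned above, forecasting the input series so that a full convolution can be computed at the end, can be sketched with a first-order autoregressive model. The model order, the lag-1 estimator, the running-mean filter, and all names below are assumptions for illustration; the preliminary analyses referred to in the text are not described in enough detail to reproduce.

```python
def ar1_extend(series, n_ahead):
    """Append n_ahead AR(1) forecasts of the anomalies to the series end."""
    mean = sum(series) / len(series)
    anomalies = [x - mean for x in series]
    # Lag-1 autocorrelation estimate (series must not be constant).
    num = sum(a * b for a, b in zip(anomalies[:-1], anomalies[1:]))
    den = sum(a * a for a in anomalies)
    phi = num / den
    forecasts = []
    last = anomalies[-1]
    for _ in range(n_ahead):
        last = phi * last          # forecast decays toward the mean
        forecasts.append(mean + last)
    return list(series) + forecasts


def running_mean(series, b):
    """(2b+1)-point running mean over positions that have a full window."""
    width = 2 * b + 1
    return [sum(series[i - b:i + b + 1]) / width
            for i in range(b, len(series) - b)]


# Hypothetical input series; extend by b = 2 points, then filter.
series = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
extended = ar1_extend(series, 2)
filtered = running_mean(extended, 2)
print(len(series), len(extended), len(filtered))
```

With the end of the series extended by b points, the filtered output reaches the final original position; the start of the series would need a corresponding hindcast.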

REFERENCES

  • Arguez, A., Bourassa M. A., and O’Brien J. J., 2005: Extraction of the MJO signal from QuikSCAT. J. Atmos. Oceanic Technol., 22, 1885–1894.

  • Bloomfield, P., 2000: Fourier Analysis of Time Series—An Introduction. 2nd ed. Wiley, 261 pp.

  • Ghil, M., and Coauthors, 2002: Advanced spectral methods for climatic time series. Rev. Geophys., 40, 1003, doi:10.1029/2000RG000092.

  • Hamming, R. W., 1989: Digital Filters. 3rd ed. Prentice Hall, 284 pp.

  • Madden, R. A., and Julian P. R., 1971: Detection of a 40–50 day oscillation in the zonal wind in the tropical Pacific. J. Atmos. Sci., 28, 702–708.

  • Mann, M. E., 2004: On smoothing potentially non-stationary climate time series. Geophys. Res. Lett., 31, L07214, doi:10.1029/2004GL019569.

  • Park, J., 1992: Envelope estimation for quasi-periodic geophysical signals in noise: A multitaper approach. Statistics in the Environmental and Earth Sciences, A. T. Walden and P. Guttorp, Eds., Halsted Press, 189–219.

  • Pegion, P. J., Bourassa M. A., Legler D. M., and O’Brien J. J., 2000: Objectively derived daily “winds” from satellite scatterometer data. Mon. Wea. Rev., 128, 3150–3168.

  • Schoof, J. T., Arguez A., Brolley J., and O’Brien J. J., 2005: A new weather generator based on spectral properties of surface air temperatures. Agric. For. Meteor., 135, 241–251.

  • Smith, T. M., and Reynolds R. W., 2003: Extended reconstruction of global sea surface temperatures based on COADS data (1854–1997). J. Climate, 16, 1495–1510.

  • Thompson, D. W. J., and Wallace J. M., 1998: The Arctic Oscillation signature in the wintertime geopotential height and temperature fields. Geophys. Res. Lett., 25, 1297–1300.

  • Von Storch, H., and Zwiers F. W., 1999: Statistical Analysis in Climate Research. Cambridge University Press, 484 pp.

  • Wheeler, M. C., and Hendon H. H., 2004: An all-season real-time multivariate MJO index: Development of an index for monitoring and prediction. Mon. Wea. Rev., 132, 1917–1932.


Fig. 1.

ENSO filter series. (top) ENSO time series utilized and (bottom) the FRF of the filter used on it.

Citation: Journal of Atmospheric and Oceanic Technology 25, 4; 10.1175/2007JTECHA924.1

Fig. 2.

AO filter series. As in Fig. 1, but for the AO.

Fig. 3.

MJO filter series. (top) The zonal velocity time series, (middle) the filter weights used, and (bottom) the associated FRF.

Fig. 4.

Subset analysis schematic. A contrived 24-point time series is utilized for a 5-month running mean. Green squares represent input values solely used to compute the “true” filtered values. Blue diamonds indicate the estimated output values. Gray circles indicate interior points unaffected by endpoint issues because complete convolutions can be computed.

Fig. 5.

ENSO root-mean-square errors. RMSEs as a function of position for the unconstrained (purple), equal-mean (green), equal-variance (blue), and spectral (red) methods for the ENSO project (top). A liberal skill threshold based on random filter output has an RMSE value of about 0.63.

Fig. 6.

ENSO endpoint variance ratio. Variance of estimates as a function of endpoint location for the ENSO case, scaled by the variance of the “true” values.

Fig. 7.

AO RMS errors. As in Fig. 5, but for the (top) AO project (skill threshold equals 0.4). (bottom) The least squares techniques’ RMSE values divided by the spectral RMSE values.

Fig. 8.

AO endpoint variances. As in Fig. 6, but for the AO case.

Fig. 9.

MJO RMS errors. As in Fig. 7, but for the MJO case (skill threshold equals 71.2).

Fig. 10.

MJO endpoint variances. As in Fig. 6, but for the MJO case.

Table 1.

Empirically determined weights from the filters in Eqs. (16) (j = 0) and (17) (j = 1). The resulting RMSE and variances are also presented.
