## 1. Introduction

This paper considers a family of parsimonious correlation models potentially relevant to studies of surface temperature fields that may be modeled by random fields distributed in space and evolving in time [on stationary but non-time-dependent random fields see Cramér and Leadbetter (2004), Yaglom (1961, 1987), and Heine (1955)]. We will be particularly concerned with how the spatial correlation structure is modified by temporally smoothing the data. Our method is to follow in the steps of the autoregressive methods familiar from time series analysis but to extend them to the two-dimensional spatial domain as well, both on the plane and the surface of the sphere. Our motivation is based upon the governing equations of simple energy balance climate models (EBCMs; see, e.g., North et al. 1983, hereinafter referred to as NMS83) that have been studied over several decades. There are potential applications in estimation problems such as the construction of smooth meteorological fields and space–time averages from datasets consisting of a finite number of point measurements at the surface (e.g., Gandin and Smith 1997). In this study we confine ourselves to models with uniform spatial statistics. This implies that the statistics are translationally stationary and isotropic on the plane or rotationally invariant on the sphere. This allows us to extract near-analytic solutions that can be compared to appropriate datasets. Of course, these idealizations do not strictly hold for the surface temperatures in general, since land surfaces induce very different statistical properties from sea surfaces. We look forward to including such discontinuities that occur at the land–sea boundaries, but first it is instructive to investigate in some detail the uniform-surface cases. We can do this in a limited way by utilizing data from a large relatively uniform landmass or oceanic region.

The pioneering work of Hansen and Lebedeff (1987, hereinafter referred to as HL87) showed that when annually averaged station-collected surface temperatures were collected and partitioned according to the distances separating the stations their correlations decayed in a near-exponential fashion as a function of separation. They showed these correlation-scatter diagrams for different latitude bands (we reproduce part of the relevant figure from HL87 in Fig. 1). The (visual) fits to simple exponentials showed less scatter in northern high latitudes, and the goodness of the fit deteriorated in the tropics with little sign of decay with separation. Visual inspection of the HL87 curves suggests a characteristic decay scale (separation where the correlation estimate falls below 1/*e*) of about 1800 km outside the tropics. For illustrative purposes we show a similar correlation scatterplot in Fig. 2 that is based on annually averaged data taken from a fairly homogeneous land surface in eastern Siberia. The data for this study are from the Research Data Archive (RDA), which is maintained by the Computational and Information Systems Laboratory at the National Center for Atmospheric Research. The original data were obtained online from the RDA (http://dss.ucar.edu/datasets/ds524.0/; includes map of station locations). This dataset consists of instrument data from 223 stations and runs from 1881 to 1989; of course, each record has a different length. Our analysis was crude, simply looking at a long time series of annual averages and estimating the correlation of data from different points with those from a central point in Asia. The scatter is large primarily because the records are so short and the natural variability on this continental landmass is large. The blue line in the figure is the mean at a particular separation over the vertical values of points in a small interval surrounding that separation. The red curve is the function *rK*_{1}(*r*) in units of 3000 km. 
The reason for this choice will be shown below. Our point is simply the similarity to the analysis of HL87, only over a more homogeneous land surface. We suggest that a homogeneous land surface, as opposed to a mix of land and ocean surfaces, yields a correlation length (the distance at which the correlation falls to 1/*e*) that is about 50% larger.

Fig. 2. Spatial autocorrelation of data from eastern Siberia. The data were annually averaged. The red curve is based on the correlation model introduced by Whittle (1954): *rK*_{1}(*r*), where *K*_{1} is the modified Bessel function. The distance *r* is in kilometers. The decorrelation length scale can be estimated to be approximately 2800 km, which is about 50% larger than in HL87.

Citation: Journal of Climate 24, 22; 10.1175/2011JCLI4199.1

North and Cahalan (1981) introduced a noise-forced EBCM on the uniform sphere to examine the concept of predictability of individual decay modes in simple climate models. In that study they made use of both the space and time dependence of the resulting random temperature field. Kim et al. (1996) showed that in data and in model simulations (GCMs as well as noise-forced EBCMs) correlation lengths differ greatly depending on whether the underlying surface is land or sea. The large-spatial-scale relaxation time in EBCMs is a month or two over land [see also Manabe and Strickler (1964), who used a radiative convective model] and a few years over large-scale ocean areas, the latter because the entire mixed layer of the ocean (about 50–100 m thick) participates in thermal changes. This means that over land the 1-yr averaging interval is several times the relaxation (or correlation) time, so that annual averages are essentially in the low-frequency limit of the Fourier frequency components. On the other hand, over large-scale expanses of ocean, the 1-yr averaging interval is only a fraction of the relaxation time. This fact leads to very short autocorrelation lengths over oceans, as seen in Kim et al. (1996). The HL87 analysis used annual averages, but the sites were a mix of ocean and land; this may explain the large scatter and a length scale that is possibly shorter than that of the uniform low-frequency limit.

HL87 argued that given the strong correlation between neighboring stations (they conservatively chose 1000 km as their characteristic length scale) one could estimate the annual average temperatures of the global surface at a tolerable accuracy with only a finite number of reasonably uniformly spaced sites on the sphere—perhaps on the order of a few hundred. Shen et al. (1994) and others have shown that, by using subsets of data as compared with the larger dataset containing thousands of sites, one could obtain very reasonable results with as few as 64 (=8^{2}) or even 49 (=7^{2}) well-placed sites. By very reasonable, they meant that the decadal variations of the global average temperature fields were resolved in comparison with sampling error. If correlation lengths are longer over land, as suggested in Figs. 2 and 3, it can be argued that even fewer uniformly spaced sites are necessary for estimating large-scale averages, at least over land. This will not be the case over ocean for annual averages. In a heuristic way, the number 49 can be estimated from the 1800-km radius of a “correlation disk”; dividing this disk area into the area of the earth’s surface yields a little over 49. This means that there are about 49 independent oscillators on the earth. The standard error in estimating an area average goes down as *n*^{−1/2}; therefore, it is reduced by a factor of 7 or 8 over the standard deviation of annual averages of global average. Adding more sites beyond 49 will continue to reduce the error but much more slowly than the *n*^{−1/2} dependence because the additional sites added will be correlated and therefore partially redundant [for some quantitative estimates, see North et al. (1992), Hardin et al. (1992), and Shen et al. (1996)]. 
The number of “independent oscillators” on the sphere is referred to as the number of degrees of freedom of the random field [see Wang and Shen (1999) for estimates of these for the real earth; this paper also contains many references to this concept for meteorological fields]. We will return to this interesting concept later.
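The heuristic count of independent correlation disks can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python; the function name and the mean earth radius of 6371 km are assumptions made here for illustration (the text elsewhere quotes 6400 km), and the 1800-km radius is the value read from HL87 above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; not a value taken from the paper


def independent_disks(corr_radius_km, sphere_radius_km=EARTH_RADIUS_KM):
    """Ratio of the sphere's surface area to the area of one correlation disk."""
    sphere_area = 4.0 * math.pi * sphere_radius_km ** 2
    disk_area = math.pi * corr_radius_km ** 2
    return sphere_area / disk_area


n_eff = independent_disks(1800.0)
# The standard error of an area average falls as n**-0.5 for independent sites.
error_reduction = math.sqrt(n_eff)
print(f"effective sites: {n_eff:.1f}, error reduction factor: {error_reduction:.1f}")
```

Because the count scales as the inverse square of the correlation radius, a 50% longer correlation length over land would reduce the number of independent disks by a factor of 2.25.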

Fig. 3. Contours of equal correlation from four sources: (top left) observations, (top right) noise-forced EBCM, (bottom left) a GCM simulation from a circa-1990s model from the Geophysical Fluid Dynamics Laboratory, and (bottom right) a similar GCM simulation from the Max Planck Institute (Hamburg, Germany). The datastreams were bandpassed to include periods from 2 months to 1 yr. Each field employs six protosites, and surface temperatures at neighboring locations are correlated with that site. The thick contour represents when the correlation falls to 1/*e*. Reproduced from Kim et al. (1996).

Noise-forced EBCMs work well in middle and higher latitudes but fare less well in the tropics. The reason for this is that the upper latitudes are more prone to the noisy weather disturbances that have time scales on the order of days and spatial scales on the order of 1000 km. Since the radiative relaxation time of a column of air and over oceans is two orders of magnitude longer, we have the classic first-order autoregressive (AR1) process, wherein the weather serves as the white-noise driver and the radiative damping serves as the response, as pointed out by Hasselmann (1976). In the tropics we have a very different situation. There is no weather noise forcing and length scales are much longer, stretching great distances in the longitudinal direction. It may be that there are even fewer degrees of freedom in the tropical surface temperature field than in the mid- and higher latitudes (meaning that even fewer sites would be necessary for a good estimate of large-area averages). Of course, the tropics are also complicated by the El Niño–Southern Oscillation phenomenon, which is well outside the scope of EBCMs. For the present, we will acknowledge that our method will not hold in the tropics. In addition, in regard to the limitation of this class of EBCMs, we mention that the mixed layer of the ocean is an idealization. Below it there is the deeper ocean, and, if the frequency is low enough, we will activate the deeper ocean, increasing the effective heat capacity even more. The turnover time of the deep ocean is many hundreds of years. This is discussed in many sources, but we refer the reader to the very recent work of Held et al. (2010) for an explanation of how this coupling works and affects the transients in climate change.
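The AR1 mechanism described above can be illustrated with a minimal sketch in Python. It assumes a zero-dimensional analog of the model, *τ* d*T*/d*t* + *T* = *F*, with white-noise *F*, sampled exactly at an interval d*t*; the function names, the ratio d*t*/*τ* = 0.1, and the random seed are illustrative assumptions rather than anything taken from the cited studies. For such a process the lag-*k* autocorrelation should be close to exp(−*k* d*t*/*τ*).

```python
import math
import random


def simulate_ar1(n_steps, dt_over_tau, seed=0):
    """Exact discretization of a damped system driven by white noise."""
    rng = random.Random(seed)
    phi = math.exp(-dt_over_tau)             # memory retained over one step
    noise_sd = math.sqrt(1.0 - phi * phi)    # keeps the stationary variance at 1
    value = 0.0
    series = []
    for _ in range(n_steps):
        value = phi * value + noise_sd * rng.gauss(0.0, 1.0)
        series.append(value)
    return series


def lag_correlation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series) - lag
    mean = sum(series) / len(series)
    var = sum((x - mean) ** 2 for x in series) / len(series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n)) / n
    return cov / var


series = simulate_ar1(200_000, dt_over_tau=0.1)
estimate = lag_correlation(series, lag=10)   # a lag of one relaxation time
print(f"sample: {estimate:.3f}, theory: {math.exp(-1.0):.3f}")
```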

In this paper we derive some simple parametric forms for frequency-dependent correlation models that provide insight into the nature of correlation structure in data analysis and that could be of utility in practice. The approach is based on simple physical climatelike models. These linear damped diffusion models are driven by white noise in space and time and take the form of familiar autoregressive statistical models AR1 in time and second-order autoregressive (AR2) in space [for studies of these kinds of models on the sphere see, e.g., North and Cahalan (1981) and Kim and North (1991, 1992)]. For studies of uniform (rotationally invariant) random fields on the sphere, refer to North and Cahalan (1981); see also, among the earliest of these, the studies by Obukhov (1947) and Jones (1963). These latter papers show the efficacy of using spherical harmonics in solving uniform-sphere problems. A modern treatment of spherical harmonics for scientists is provided by Arfken and Weber (2001). A recent approach to studying correlation models for nonstationary random fields on the sphere is reported by Jun and Stein (2008).

We focus on the case in which a narrow band of frequencies centered on the angular frequency *ω* is filtered out of the datastream. This is easily extended to the low-pass case, in which the frequency components that are lower than *ω* are retained. The case of moving averages is also considered. All three have the similar property that in the limit *ω* → 0 (or long moving average) they lead to a finite limiting decorrelation length. In the low-frequency limit the time-derivative term, which is proportional to *ωC*(*x*, *y*)*T*_{ω} [where *T*_{ω} is the temporal Fourier transform of *T*(*x*, *y*, *t*)], vanishes along with all spatial dependence on *C*(*x*, *y*), and we are left with the long-time-average model; see (1) below.

The plan of the paper is first to introduce the EBCM forms (noise-driven damped diffusion) on the uniform plane and then in section 3 to extend this formalism to the surface of a uniform sphere. Section 4 examines the differences for our analysis for different ways of filtering the datastream (narrowband pass and low pass vs moving average). A discussion and concluding remarks are in section 5.

## 2. Simplest EBCM forms

Consider the simplest EBCM on the plane,

$$C(x,y)\,\frac{\partial T}{\partial t} - D\nabla^{2}T + BT = BF(x,y,t), \tag{1}$$

where *T*(*x*, *y*, *t*) is (nominally) the temperature in the *x–y* plane at time *t*, *C*(*x*, *y*) is a local effective heat capacity, *D* is a thermal diffusion coefficient, and *B* is a damping coefficient (radiation to space in the climate model). In the equation, *BF*(*x*, *y*, *t*) is a climatic driving term that could be a solar driver such as the seasonal cycle, but we will take it to be a noise term that excites fluctuations in the response, *T*(*x*, *y*, *t*). The noise is to be white (no correlation between neighboring times or locations). We will be crude in our treatment of this stochastic process, where possible ignoring some of the complications arising in the classical studies of Brownian motion, by thinking of the time and space steps as finite (or, equivalently, the spectra are cut off at finite upper limits) but using continuous differential forms when there is no danger of singularities popping up.

The study of random fields based on the solutions of differential equations driven by noise has a long history, including papers on the Brownian motion of particles by Einstein (1905), the collection of papers of Wax (1954), and the monograph by Gardiner (1985).

We can divide (1) by *B* to obtain

$$\tau\,\frac{\partial T}{\partial t} - \lambda^{2}\nabla^{2}T + T = F(x,y,t), \tag{2}$$

where *τ* = *C*/*B* is a relaxation time scale and *λ* = (*D*/*B*)^{1/2} is a length scale. These two scale parameters will prove useful throughout the study. Note that *τ* and *λ* are the only two parameters that enter this study. In data analysis these are the two parameters that would need to be estimated.

First, notice that the governing equation in (1) is not so strange from the point of view of statistical modeling. It is simply a first-order process in time (AR1) and a second-order process in space (AR2). The ∇^{2} is a rotationally invariant operator, ensuring that the statistical properties of an ensemble of solutions will be rotationally invariant in the plane. Hence, our process is the lowest-order autoregressive process that can be constructed that is rotationally invariant in the plane. Anisotropy can be introduced with a two-dimensional symmetric tensor for *D* (introducing a second length-scale parameter), and some anisotropy is present in the data; we will ignore that complication in this study, however. If *D* were to depend on position, it would occur in the form **∇** · *D***∇**, but it will be constant and isotropic in this study.

In general, the effective heat capacity *C*(*x*, *y*) depends on position. Over land it is small. The relaxation time of a large land area due to radiation to space is on the order of a month or two, whereas over the ocean surfaces this characteristic time is larger by a factor of 10–80. Hence, at land–sea boundaries there is a large discontinuity in *C*(*x*, *y*) and therefore also in *τ*(*x*, *y*). In a global model that is solved numerically, this problem is not serious; in analytical studies, however, it is a formidable impediment to obtaining simple parametric forms for solutions and correlation models. In addition, in the real world *D* might be dependent on *x* and *y* as well. It might even be anisotropic (examination of the data shows that it is mildly anisotropic in midlatitudes, as expected). The weakly anisotropic case can be handled reasonably easily in our problem by the use of a tensor form for the diffusion coefficient. But we defer inclusion of these complications for this study, relying on the uniform cases wherein *C*, *B*, and *D* are constants independent of position. We reproduce in Fig. 3 a figure from Kim et al. (1996), wherein six protosites are used as locations from which correlations with neighboring sites are computed. The closed contours represent a locus of equal correlation with data at the center point. The heavy contour indicates where the correlation falls to 1/*e*. The panels illustrate the fact that correlation lengths depend on the land–sea surface type. We do not claim that this is fair to GCMs of the mid-1990s, when the models were far more primitive than those of today. There are several points in Fig. 3 that are worthy of comment:

The correlation lengths are longer over land than over large oceanic expanses for this datastream, which was bandpassed for periods between 2 months and 1 year.

Even the correlation contour at land’s edge in San Francisco, California, shows short lengths to the west and long lengths to the east.

Note the hint of ENSO in the top-left panel but none yet apparent in the GCMs and certainly not in the EBCM. All current coupled GCMs have some form of spontaneous ENSO.

Note the Himalayan-induced flattening on the equatorward side of the contour centered in middle Asia in all except the (flat!) EBCM.

The paper by Kim et al. (1996) includes some additional figures showing that the contours swell horizontally as the bandpassed frequencies become lower. This last is the major point to be raised in the present paper. We believe that obtaining good agreement among second-order statistics, such as spatial correlations, is a necessary condition for trusting a model for various purposes. The agreement we find in Fig. 3 gives us confidence that the EBCM can provide useful—but, of course, limited—guidance in exploring various correlation models and their behavior with respect to the frequency dependence of correlation patterns.

A physical analog of (1) is heat conduction in a uniform planar medium with heat capacity *C*, thermal conductivity *D*, and damping (Newtonian cooling, say, to the air above) coefficient *B*. The noise on the right-hand side is that of a “mad” Gaussian heater/cooler who warms and cools spots at random in space and time. The resulting solution is a random field responding to the mad heater with a smoother field whose large-scale averages decay with time scale *τ* and whose long-term spatial autocorrelation lengths are *λ* (to be shown presently). A heuristic view might consist of a temperature anomaly random-walking away from its initial location. The anomaly spreads horizontally proportional to *t*^{1/2}. There is the relaxation time *τ*, however. The distance “diffused” in that relaxation time is just the length scale *λ*. The correlation length then is the *root-mean-square* distance that a disturbance covers in a single relaxation time for the field. This can be verified from the solution to the initial-value problem expressed as the partial differential equation

$$\tau\,\frac{\partial T}{\partial t} - \lambda^{2}\nabla^{2}T + T = 0,\qquad T(\mathbf{r},0)=\delta(\mathbf{r}),$$

in which at time *t* = 0 all heat energy is concentrated at the origin (indicated by the Dirac delta function *δ*(**r**), where **r** is the position vector **r** = *x***i** + *y***j**). The solution to the homogeneous version of (2) yields a temperature distribution in the plane that is independent of polar angle and dependent only on radial distance from the origin *r* = |**r**| and time *t*:

$$T(r,t) = \frac{\tau}{4\pi\lambda^{2}t}\,\exp\!\left(-\frac{\tau r^{2}}{4\lambda^{2}t}-\frac{t}{\tau}\right).$$

The integral over *r* of 2*πr* times this function (the total heat content) decays exponentially with time constant *τ*, and the radial width of the function *T*(*r*, *t*) is proportional to (2*λ*^{2}*t*/*τ*)^{1/2}. Solutions such as this one can be found in some of the earliest papers on Brownian motion [e.g., Einstein (1905) and papers in and referred to in the collection by Wax (1954)].
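The two properties just stated can be checked numerically. The sketch below, in Python, assumes the standard damped heat kernel for a unit pulse at the origin, *T*(*r*, *t*) = [*τ*/(4*πλ*^{2}*t*)] exp[−*τr*^{2}/(4*λ*^{2}*t*) − *t*/*τ*], and works in units in which *τ* = *λ* = 1; the function names and the integration grid are illustrative choices. For this kernel the total heat content should equal exp(−*t*/*τ*) and the root-mean-square radial distance should equal 2*λ*(*t*/*τ*)^{1/2}, that is, a fixed multiple of *λ* after one relaxation time.

```python
import math


def kernel(r, t, tau=1.0, lam=1.0):
    """Response at radius r and time t to a unit heat pulse at the origin."""
    kappa_t = lam ** 2 * t / tau              # diffusivity times elapsed time
    return math.exp(-r * r / (4.0 * kappa_t) - t / tau) / (4.0 * math.pi * kappa_t)


def radial_moment(power, t, tau=1.0, lam=1.0, r_max=20.0, steps=20000):
    """Midpoint-rule estimate of the integral of 2*pi*r * r**power * T(r, t) dr."""
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += 2.0 * math.pi * r * r ** power * kernel(r, t, tau, lam) * dr
    return total


t = 1.0                                       # one relaxation time
heat = radial_moment(0, t)                    # total heat content
rms = math.sqrt(radial_moment(2, t) / heat)   # rms radial displacement
print(f"heat content: {heat:.4f} (exp(-1) = {math.exp(-1.0):.4f})")
print(f"rms distance: {rms:.4f} in units of lambda")
```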

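We now consider times much longer than *τ*, such that all transients have died away and we are left with the statistically steady forced solutions. We introduce the white-noise forcing component at angular frequency *ω* and vector wavenumber **k**:

$$F_{\mathbf{k},\omega}\,e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)},$$

where **r** = *x***i** + *y***j**. We can write the response at this same (angular) frequency [=(2*π*)/(cycle period)] and wavenumber (|**k**| = 2*π*/wavelength) as

$$T_{\mathbf{k},\omega} = \frac{F_{\mathbf{k},\omega}}{1+\lambda^{2}k^{2}-i\omega\tau}.$$

Because the forcing is white, its spectral density is the same at every **k** and *ω*. We denote the covariance of the response at frequency *ω* and separation *r* by *C*_{ω}(*r*); integrating the response variance over all wavenumbers gives, up to a constant factor,

$$C_{\omega}(r) \propto \int_{0}^{\infty}\frac{k\,J_{0}(kr)}{\left(1+\lambda^{2}k^{2}\right)^{2}+\omega^{2}\tau^{2}}\,dk.$$

By symmetry we have assumed that the covariance can only depend on the separation *r*, and *J*_{0} is the Bessel function of the first kind of order zero. The angular integral is known as Bessel’s integral.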

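Consider first the low-frequency limit *ω* → 0. In this case the remaining integral over *k* can be performed (e.g., by using Mathematica software):

$$\int_{0}^{\infty}\frac{k\,J_{0}(kr)}{\left(1+\lambda^{2}k^{2}\right)^{2}}\,dk = \frac{r}{2\lambda^{3}}\,K_{1}\!\left(\frac{r}{\lambda}\right),$$

so that the correlation function [normalized such that *ρ*_{0}(0) = 1] is given by

$$\rho_{0}(r) = \frac{r}{\lambda}\,K_{1}\!\left(\frac{r}{\lambda}\right),$$

where *K*_{1} is the modified Bessel function of degree 1. This form is known to statisticians as Whittle’s correlation model (Whittle 1954) and is a particular member of the Matérn class (Matérn 1960); for a review of this family of correlation functions, see Guttorp and Gneiting (2006). For a Bayesian approach to kriging of spatial random fields also using this form, see Handcock and Stein (1993). It is noteworthy that in linear systems such as this, the forcing variance cancels out of the correlation function. The result for *ω* > 0 can also be found:

$$C_{\omega}(r) \propto -\frac{1}{\omega\tau}\,\mathrm{Im}\,K_{0}\!\left(\frac{br}{\lambda}\right),\qquad \rho_{\omega}(r) = -\frac{2}{\arctan(\omega\tau)}\,\mathrm{Im}\,K_{0}\!\left(\frac{br}{\lambda}\right),$$

where *b* = (1 + *iωτ*)^{1/2}. Note that *r* always occurs in the combination *r*/*λ* and that *ω* always occurs in the combination *ωτ*. The slope of *C*_{ω}(*r*) vanishes as *r* → 0 for finite *ω*. This assures that the spatial random fields on the plane generated in this class of models will be smooth. This is in contrast with the covariance function *e*^{−αr}, which is sometimes used. Note the shape of the correlation function in Fig. 2 as *r* → 0.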

Figure 4 shows a family of theoretical autocorrelation curves, each for a different value of the dimensionless parameter *ωτ*. First, note that they approach a limiting form as *ω* → 0. Second, the distal extent of the correlation (roughly measured by the separation at which *ρ* falls to *e*^{−1}) shortens dramatically as one increases the frequency of the bandpass center.
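The shortening of the 1/*e* distance with increasing *ωτ* can be explored with a minimal sketch in Python. It rests on two assumptions stated here rather than taken from the text: that for white-noise forcing the planar covariance is proportional to the integral of *kJ*_{0}(*kr*)/[(1 + *λ*^{2}*k*^{2})^{2} + *ω*^{2}*τ*^{2}], and that a partial-fraction evaluation of this integral gives *ρ*_{ω}(*r*) = −2 Im *K*_{0}(*br*/*λ*)/arctan(*ωτ*) with *b* = (1 + *iωτ*)^{1/2}, reducing to (*r*/*λ*)*K*_{1}(*r*/*λ*) as *ω* → 0. The Bessel functions are evaluated from the integral representation *K*_{ν}(*z*) = ∫ exp(−*z* cosh *t*) cosh(*νt*) d*t* so that only the standard library is needed; all function names are illustrative.

```python
import cmath
import math


def bessel_k(nu, z, t_max=20.0, steps=4000):
    """K_nu(z) for Re z > 0 from int_0^inf exp(-z cosh t) cosh(nu t) dt."""
    z = complex(z)
    dt = t_max / steps
    total = 0.5 * cmath.exp(-z)              # half weight at the t = 0 endpoint
    for i in range(1, steps):
        t = i * dt
        arg = z * math.cosh(t)
        if arg.real > 700.0:                 # remaining terms are negligible
            break
        total += cmath.exp(-arg) * math.cosh(nu * t)
    return total * dt


def rho(x, omega_tau):
    """Correlation at separation x = r / lambda for bandpass frequency omega."""
    if x == 0.0:
        return 1.0
    if omega_tau == 0.0:
        return x * bessel_k(1, x).real       # Whittle form in the limit omega -> 0
    b = cmath.sqrt(1.0 + 1j * omega_tau)
    return -2.0 * bessel_k(0, b * x).imag / math.atan(omega_tau)


def efold_distance(omega_tau, lo=1e-3, hi=5.0, iters=30):
    """Bisection for the separation (in units of lambda) at which rho = 1/e."""
    target = math.exp(-1.0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rho(mid, omega_tau) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for wt in (0.0, 1.0, 3.0, 10.0):
    print(f"omega*tau = {wt:4.1f}: 1/e distance = {efold_distance(wt):.3f} lambda")
```

The bisection assumes a single crossing of 1/*e* between the bracket ends, which is adequate for the parameter values shown but is not guaranteed in general.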

Fig. 4. EBCM-computed correlation of surface temperature fluctuations between sites that are a distance *r*/*λ* apart [(6)]. The geometry is a flat plane with constant values of the coefficients in the EBCM. Different curves are for values of the dimensionless parameter *ωτ*, where *ω* is the (angular) bandpass frequency of the observations and *τ* is the relaxation time of the large-scale temperature field.

We have taken some surface temperature data from the midlatitude Pacific Ocean as well as those from eastern Siberia to illustrate the effect. Here we did not employ a bandpass filter, but rather a series of moving averages on the data. Figure 5 shows results for these calculations. We suspect that an effective value of *τ* for the ocean surface data is a few years (NMS83); depending on the mixed layer depth, season, and location, and especially at very low frequencies, it may be even larger as more depth of ocean is shared in the response (see, e.g., Held et al. 2010). Nevertheless, we can see a pattern emerging that is similar to the mathematical construct in Fig. 4. The curves in Fig. 5 tend to flatten for large values of separation. This is likely to be due to long-term (large spatial scale) trends in the time series (e.g., global warming or basinwide coherent multidecadal oscillations). This effect was partially removed by treating residuals from straight-line detrended time series of the individual grid boxes from which the data were taken. Of course, we realize that the time series were influenced by the procedures used in transferring the ocean surface data onto the grid prior to our procurement of them. On the other hand, we doubt whether the feature in which we are interested is very dependent on this effect. The data for the ocean surface were obtained online from the RDA (http://dss.ucar.edu/datasets/ds277.0/). These gridded (5° × 5°) data span 1854–2011 and are described in Smith et al. (2008).

Fig. 5. Correlation of surface temperatures from sites separated by a distance *r*. Solid lines are for sites over the North Pacific; dashed lines are for uniform land areas in Siberia.

All of the curves show the same tendency toward longer correlation lengths as the averaging time of the data is increased. It is far more pronounced with the land data, but we suspect that the assumption of uniform surface is violated as the distances become large (some distant points will “feel” the ocean). The ocean surface data are more orderly partly because the natural variability is less, reducing sampling errors, but again this may be influenced by the procedures used by Smith et al. (2008) in taking those data onto a uniform grid in the North Pacific. Table 1 shows estimates of decorrelation length as a function of the moving-average interval for Siberia and the North Pacific.

Table 1. Decorrelation lengths (km) for different averaging intervals for Siberia and the North Pacific.

## 3. EBCM correlations on the uniform sphere

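On the sphere of radius *R* the governing equation has the same form as (2), with ∇^{2} now taken to be the Laplacian operator on the unit sphere (and the diffusion coefficient rescaled accordingly, *D* → *DR*^{2}). We may write

$$\nabla^{2} = \frac{\partial}{\partial\mu}\left(1-\mu^{2}\right)\frac{\partial}{\partial\mu} + \frac{1}{1-\mu^{2}}\,\frac{\partial^{2}}{\partial\phi^{2}},$$

where *μ* = cos*θ*, *θ* is the polar angle, and *ϕ* is longitude. The length scale *λ* is in units of earth radius (6400 km).

The spherical harmonics *Y*_{n,m} are the eigenfunctions of ∇^{2}:

$$\nabla^{2}Y_{n,m}(\mu,\phi) = -n(n+1)\,Y_{n,m}(\mu,\phi),$$

and they are orthonormal over the 4*π* steradians of solid angle on the sphere. Now a temperature or forcing field can be expanded into the basis set:

$$T(\mu,\phi,t) = \sum_{n=0}^{\infty}\sum_{m=-n}^{n}T_{n,m}(t)\,Y_{n,m}(\mu,\phi).$$

Inserting a single frequency component proportional to *e*^{−iωt} leads to the same kind of algebraic governing equation as in the planar case:

$$T_{n,m,\omega} = \frac{F_{n,m,\omega}}{1+\lambda^{2}n(n+1)-i\omega\tau}.$$

For white-noise forcing, the covariance between two points on the sphere at frequency *ω*, up to a constant factor, is

$$C_{\omega}(\mu) \propto \sum_{n=0}^{\infty}\frac{(2n+1)\,P_{n}(\mu)}{\left[1+\lambda^{2}n(n+1)\right]^{2}+\omega^{2}\tau^{2}},$$

where *P*_{n} is the *n*th Legendre polynomial and we have used the addition theorem for spherical harmonics. By rotational invariance we may place one point at the pole (*μ* = 1) and the other at polar angle *θ* = cos^{−1}*μ* so that we can write

$$\rho_{\omega}(\theta) = \frac{C_{\omega}(\cos\theta)}{C_{\omega}(1)}.$$

Figure 6 shows this correlation for *λ*/*R* = 0.4, corresponding to a length scale of 2560 km.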
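The spherical correlation can be evaluated by direct summation. The following minimal sketch in Python assumes that, for white-noise forcing, the covariance at frequency *ω* is proportional to the sum over *n* of (2*n* + 1)*P*_{n}(*μ*)/{[1 + *λ*^{2}*n*(*n* + 1)]^{2} + *ω*^{2}*τ*^{2}}, with *λ* in units of the sphere radius; the function name, the truncation at *n* = 400, and the sample separation are illustrative assumptions. The Legendre polynomials are generated with the standard three-term recurrence.

```python
import math


def sphere_correlation(theta, omega_tau, lam=0.4, n_max=400):
    """Normalized correlation at angular separation theta (radians).

    lam is the length scale in units of the sphere radius.
    """
    mu = math.cos(theta)
    p_prev, p_curr = 1.0, mu                 # P_0(mu) and P_1(mu)
    num = 0.0
    den = 0.0
    for n in range(n_max + 1):
        if n == 0:
            p_n = 1.0
        elif n == 1:
            p_n = mu
        else:
            # n P_n = (2n - 1) mu P_{n-1} - (n - 1) P_{n-2}
            p_n = ((2 * n - 1) * mu * p_curr - (n - 1) * p_prev) / n
            p_prev, p_curr = p_curr, p_n
        weight = (2 * n + 1) / ((1.0 + lam ** 2 * n * (n + 1)) ** 2 + omega_tau ** 2)
        num += weight * p_n
        den += weight                        # P_n(1) = 1 gives the variance
    return num / den


theta = 0.5                                  # separation of 0.5 rad, or 1.25 lambda
for wt in (0.0, 1.0, 5.0):
    print(f"omega*tau = {wt:3.1f}: rho = {sphere_correlation(theta, wt):.3f}")
```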

Fig. 6. EBCM-computed correlation of surface temperature fluctuations between sites a distance *Rθ*/*λ* apart on the sphere [computed from (9) after normalization, such that *ρ*_{ω}(0) = 1]. The geometry is a featureless spherical surface (constant values of the coefficients) in the EBCM. Different curves are for values of the dimensionless parameter *ωτ*, where *ω* is the (angular) bandpass frequency of the observations and *τ* is the relaxation time of the large-scale temperature field. Numerical values are for *λ*/*R* = 0.4.

## 4. Moving average versus bandpass at *ω*

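We now consider the effect of applying a linear filter in time to the temperature time series *T*(*t*). Taking the inverse Fourier transformation of (5), we have

$$T_{n,m}(t) = \int_{-\infty}^{\infty}\frac{F_{n,m,\omega}\,e^{-i\omega t}}{1+\lambda^{2}n(n+1)-i\omega\tau}\,d\omega.$$

Applying a filter with frequency response *M*_{ω} to *T*_{n,m}(*t*) results in

$$\tilde{T}_{n,m}(t) = \int_{-\infty}^{\infty}\frac{M_{\omega}\,F_{n,m,\omega}\,e^{-i\omega t}}{1+\lambda^{2}n(n+1)-i\omega\tau}\,d\omega.$$

Because the forcing is white, the ensemble average of the product of forcing amplitudes at frequencies *ω* and *ω*′ is proportional to *δ*(*ω* − *ω*′), where *δ* is the Dirac delta function, and we obtain the space spectral density, which is given by

$$\left\langle|\tilde{T}_{n,m}|^{2}\right\rangle \propto \int_{-\infty}^{\infty}\frac{|M_{\omega}|^{2}}{\left[1+\lambda^{2}n(n+1)\right]^{2}+\omega^{2}\tau^{2}}\,d\omega.$$

For the low-pass filter, |*M*_{ω}|^{2} = (2Ω)^{−1} for −Ω ≤ *ω* ≤ Ω and is 0 otherwise, which leads to

$$\left\langle|\tilde{T}_{n,m}|^{2}\right\rangle \propto \frac{1}{\Omega\tau\left(1+b^{2}\right)}\arctan\!\left(\frac{\Omega\tau}{1+b^{2}}\right),$$

where *b*^{2} = *λ*^{2}*n*(*n* + 1).

For a box-shaped moving average of width Δ, |*M*_{ω}|^{2} is proportional to [sin(*ω*Δ/2)/(*ω*Δ/2)]^{2}, which leads to

$$\left\langle|\tilde{T}_{n,m}|^{2}\right\rangle \propto \frac{\tau}{\Delta^{2}\left(1+b^{2}\right)^{3}}\left[\frac{\left(1+b^{2}\right)\Delta}{\tau}-1+e^{-\left(1+b^{2}\right)\Delta/\tau}\right],$$

where *b* is the same as above. Figure 7 shows the correlation of temperature time series between separated points on the sphere for different values of 2*πτ*/Δ, where *τ* is the relaxation time of the random temperature field and Δ is the width of the box-shaped moving average that was applied to the evolving field. In rough terms, the uppermost curves represent the cases for which the most lower-frequency Fourier components are retained.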
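The effect of the two filters on the spatial correlation can be sketched numerically. The Python example below is a sketch under stated assumptions, not a reproduction of the calculation behind Fig. 7: it assumes white-noise forcing, writes *a* = 1 + *λ*^{2}*n*(*n* + 1), and uses closed forms obtained by integrating |*M*_{ω}|^{2}/(*a*^{2} + *ω*^{2}*τ*^{2}) over frequency, namely arctan(Ω*τ*/*a*)/(Ω*τa*) for the ideal low-pass filter and (*a*Δ/*τ* − 1 + *e*^{−aΔ/τ})/[*a*^{3}(Δ/*τ*)^{2}] for the box average, with constant factors dropped. Function names and parameter values are illustrative.

```python
import math


def lowpass_weight(a, cutoff_tau):
    """Spectral weight for an ideal low-pass filter; cutoff_tau = Omega * tau."""
    return math.atan(cutoff_tau / a) / (cutoff_tau * a)


def moving_average_weight(a, width_over_tau):
    """Spectral weight for a box average; width_over_tau = Delta / tau."""
    x = a * width_over_tau
    # x - 1 + exp(-x), written with expm1 to avoid cancellation at small x
    return (x + math.expm1(-x)) / (a ** 3 * width_over_tau ** 2)


def filtered_correlation(theta, weight, lam=0.4, n_max=400):
    """Normalized correlation on the sphere for a given spectral weight(a)."""
    mu = math.cos(theta)
    p_prev, p_curr = 1.0, mu                 # P_0(mu) and P_1(mu)
    num = 0.0
    den = 0.0
    for n in range(n_max + 1):
        if n == 0:
            p_n = 1.0
        elif n == 1:
            p_n = mu
        else:
            p_n = ((2 * n - 1) * mu * p_curr - (n - 1) * p_prev) / n
            p_prev, p_curr = p_curr, p_n
        a = 1.0 + lam ** 2 * n * (n + 1)
        w = (2 * n + 1) * weight(a)
        num += w * p_n
        den += w
    return num / den


theta = 0.5                                  # separation of 0.5 rad, or 1.25 lambda
for width in (0.5, 5.0, 50.0):
    r = filtered_correlation(theta, lambda a: moving_average_weight(a, width))
    print(f"Delta/tau = {width:5.1f}: rho = {r:.3f}")
```

In both cases the weight tends to a multiple of 1/*a*^{2} as the filter narrows toward zero frequency (Ω*τ* → 0 or Δ/*τ* → ∞), which is the spectrum of the long-time-average limit.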

Fig. 7. Autocorrelation between points in the correlation model on the sphere as a function of great-circle separation *s* = *Rθ*/*λ* (same units as in Fig. 6). Note the convergence toward the upper curve as the averaging length increases. In this example, *λ*/*R* = 0.4. The curves are labeled by the dimensionless parameter 2*πτ*/Δ, where *τ* is the characteristic time of the random field and Δ is the width of the box-shaped moving average. The uppermost curve corresponds to the longest range of the temporal moving average.

Citation: Journal of Climate 24, 22; 10.1175/2011JCLI4199.1
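A rough numerical sketch of the low-pass case may help fix ideas. It assumes (this is not recoverable from the equations as printed here) that each spherical-harmonic degree *n* contributes variance proportional to 1/(*d*_{n}^{2} + *ω*^{2}*τ*^{2}), with *d*_{n} = 1 + (*λ*/*R*)^{2}*n*(*n* + 1), the usual form for a linear EBCM driven by white noise, and that the filter is |*M*_{ω}|^{2} = (2Ω)^{−1} on [−Ω, Ω]. Averaging over the band then gives an arctangent factor per degree. The function names and the parameter `omega_tau_max` (standing for Ω*τ*) are illustrative and do not come from the paper.

```python
import numpy as np
from numpy.polynomial import legendre


def lowpass_degree_variance(n, lam_over_R, omega_tau_max):
    """Band-averaged variance of degree n (up to a constant factor).

    ASSUMPTION: per-degree spectrum 1 / (d_n**2 + (omega*tau)**2) with
    d_n = 1 + (lam/R)**2 * n * (n + 1). Its average over
    -Omega <= omega <= Omega is arctan(Omega*tau / d_n) / (Omega*tau * d_n).
    """
    damping = 1.0 + lam_over_R**2 * n * (n + 1)
    return np.arctan(omega_tau_max / damping) / (omega_tau_max * damping)


def lowpass_correlation(theta, lam_over_R=0.4, omega_tau_max=1.0, n_max=200):
    """Correlation at angular separation theta (radians), normalized to 1 at 0."""
    n = np.arange(n_max + 1)
    # The addition theorem collapses the sum over m into (2n + 1) * P_n(cos theta).
    coeffs = (2 * n + 1) * lowpass_degree_variance(n, lam_over_R, omega_tau_max)
    rho = legendre.legval(np.cos(theta), coeffs)
    # P_n(1) = 1, so the zero-separation value is simply the sum of coefficients.
    return rho / coeffs.sum()


if __name__ == "__main__":
    theta = np.linspace(0.0, 1.5, 6)
    for omega_tau_max in (0.1, 1.0, 10.0):
        print(omega_tau_max, np.round(lowpass_correlation(theta, 0.4, omega_tau_max), 3))
```

Smaller values of `omega_tau_max` correspond to longer averaging windows. Under these assumptions the correlation at a fixed separation rises as `omega_tau_max` falls, in line with the ordering of the curves in Fig. 7.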


In the case of the infinite plane, we can similarly reduce the problem to a one-dimensional integral (as compared with the sum for the spherical surface). In finding the spatial spectrum, 〈|*T*_{**k**}|^{2}〉, we encounter the same integral as in (9) with *n*(*n* + 1) replaced by *k*^{2}. Also, the factor *e*^{i**k**·**r**} must be inserted to invert the Fourier transform. This means that both the low-pass and moving-average filters can be solved as in the spherical-surface case. Next, instead of the sum over the spherical harmonic indices *m* and *n*, we must perform the double integral over *k*_{x} and *k*_{y}, for which, by symmetry (as in the use of the addition theorem), we can use the polar coordinates *k* and *θ*, leading to the integral found in (4) and (5). Hence, in the planar case we are able to reduce the problem to a one-dimensional integral, which can easily be performed numerically if not analytically as a function of the filter parameter.
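The polar-coordinate reduction can be sketched numerically as follows. The sketch assumes a planar spectrum proportional to 1/[(1 + *λ*^{2}*k*^{2})^{2} + *ω*^{2}*τ*^{2}], that is, the spherical form with *n*(*n* + 1) replaced by *k*^{2}; the damping term "1 +" is an assumption and is not visible in the equations as printed here. The angular average of *e*^{i**k**·**r**} is evaluated directly rather than through a Bessel-function routine, and the grid sizes and the cutoff `k_max` are arbitrary illustrative choices.

```python
import numpy as np


def planar_correlation(r, lam=1.0, omega_tau=0.0, k_max=60.0, n_k=3001, n_phi=256):
    """Isotropic correlation on the plane at separation r, normalized to 1 at r = 0.

    ASSUMPTION: spectrum 1 / ((1 + (lam*k)**2)**2 + omega_tau**2).
    """
    k = np.linspace(0.0, k_max, n_k)
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    spectrum = 1.0 / ((1.0 + (lam * k) ** 2) ** 2 + omega_tau**2)

    # Angular average of exp(i k.r); the imaginary part cancels by symmetry,
    # leaving the mean of cos(k r cos(phi)), which equals J0(k r).
    angular = np.cos(np.outer(k * r, np.cos(phi))).mean(axis=1)

    # Trapezoid weights for the remaining one-dimensional integral over k.
    dk = k[1] - k[0]
    weights = np.full(n_k, dk)
    weights[0] = weights[-1] = 0.5 * dk

    numerator = np.sum(weights * angular * spectrum * k)
    denominator = np.sum(weights * spectrum * k)
    return numerator / denominator


if __name__ == "__main__":
    for omega_tau in (0.0, 1.0, 5.0):
        values = [planar_correlation(r, omega_tau=omega_tau) for r in (0.5, 1.0, 2.0)]
        print(omega_tau, np.round(values, 3))
```

With `omega_tau = 0` and `lam = 1` the assumed spectrum reduces to (1 + *k*^{2})^{−2}, whose transform is *rK*_{1}(*r*), so the output can be checked against that function up to the truncation error of the finite cutoff.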

## 5. Discussion and conclusions

This paper introduced some analytical parametric forms for correlations of random temperature fields on the plane and on a spherical surface. In each case the time series were assumed to be stationary and the spatial field was assumed to be statistically homogeneous and isotropic. The forms are motivated by simple stochastic climate models, which provide some physical insight into the dependencies. In the homogeneous and isotropic models considered in this paper, the autocorrelation function of surface temperature fluctuations between separated sites exhibits a correlation length (here taken as the distance at which the autocorrelation falls to *e*^{−1}) that lengthens with decreasing frequency, approaching a limiting value as the frequency of the corresponding Fourier component of the datastream tends to zero. Low frequency here means the range in which the inverse of the frequency is much longer than *τ*, the relaxation time of the random field for large-area averages. Filtered data of higher and higher frequency exhibit shorter and shorter autocorrelation lengths. The relaxation time of surface temperature over homogeneous land areas is on the order of a month or two, whereas over ocean surfaces it is a few years. Hence, the meaning of high and low frequency depends on the heat-capacity characteristics of the surface (land or ocean, and perhaps topography). The number of statistically independent regions on the sphere, which can be thought of as the effective number of degrees of freedom, is likely to diminish as data are low-pass filtered.
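The dependence of the *e*^{−1} correlation length on frequency can be illustrated with a short sketch. It assumes the bandpass correlation on the sphere has per-degree weights (2*n* + 1)/(*d*_{n}^{2} + *ω*^{2}*τ*^{2}) with *d*_{n} = 1 + (*λ*/*R*)^{2}*n*(*n* + 1); this is the standard linear-EBCM form and is an assumption here, not a transcription of (9). The helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def bandpass_correlation(theta, omega_tau, lam_over_R=0.4, n_max=300):
    """Correlation at angular separation theta for a narrow band at omega.

    ASSUMPTION: degree-n weight (2n + 1) / (d_n**2 + omega_tau**2),
    d_n = 1 + (lam/R)**2 * n * (n + 1).
    """
    n = np.arange(n_max + 1)
    damping = 1.0 + lam_over_R**2 * n * (n + 1)
    coeffs = (2 * n + 1) / (damping**2 + omega_tau**2)
    return legendre.legval(np.cos(theta), coeffs) / coeffs.sum()


def efolding_angle(omega_tau, lam_over_R=0.4, n_grid=2000):
    """Smallest angle (radians) at which the correlation drops below 1/e."""
    theta = np.linspace(0.0, np.pi, n_grid)
    rho = bandpass_correlation(theta, omega_tau, lam_over_R)
    target = np.exp(-1.0)
    below = np.nonzero(rho < target)[0]
    if below.size == 0:
        return np.pi  # correlation never falls below 1/e on the sphere
    i = below[0]  # i >= 1 because rho[0] = 1
    # Linear interpolation between the two grid points bracketing the crossing.
    frac = (rho[i - 1] - target) / (rho[i - 1] - rho[i])
    return theta[i - 1] + frac * (theta[i] - theta[i - 1])


if __name__ == "__main__":
    for omega_tau in (0.0, 0.5, 1.0, 2.0, 10.0):
        print(omega_tau, round(float(efolding_angle(omega_tau)), 4))
```

Multiplying the returned angle by the sphere radius gives a length. Under the stated assumptions the length shrinks as `omega_tau` grows and levels off as `omega_tau` tends to zero.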

We tested our correlation model with some data from fairly homogeneous ocean and land surfaces—the north central Pacific and Siberia. Both datasets exhibited the frequency dependence suggested by the homogeneous EBCM, taking into account the large difference in *τ* over the two surface types. Correlation lengths were very large over both surfaces for *ωτ* ≪ 1, exceeding 2500 km over the ocean surface. These lengths might be even larger for decadally averaged data. Over land the correlation lengths at low frequencies were even larger, but the data quality and the assumptions regarding homogeneity were suspect at the lower frequencies. We also showed that on the sphere the same conclusions hold for moving-averaged data as opposed to data filtered to a narrow frequency band. Although record lengths of instrumental data for testing our hypothesis are lacking, it should be possible to test it with long control runs of GCMs. Moreover, models should have the same correlation characteristics as data even given the limited records of the datastream, as suggested by the early work by Kim et al. (1996).

Left partially open is the complication of land and sea distribution. Variance of the surface temperature fields depends strongly on the positioning of land and sea in data, EBCMs, and general circulation model simulations (Kim et al. 1996), with large variance over (midlatitude) continental interiors and small variance over the ocean surface. The variance field smooths as the filtering frequency is lowered. Similarly, the correlation lengths are long over continental interiors and short over oceans. As the frequency is lowered to less than one cycle per few years, however, the ocean and land surfaces appear to homogenize. This suggests that some further work on correlation models with land–sea borders might be of use.

One conjecture that comes from this is that long time averages of the datastream (or very-low-pass-filtered datastreams) can lead to very long correlation lengths. This has implications for paleoclimatic reconstructions, in the sense that many of these datastreams have been filtered by biogeophysical processes to very long times (e.g., ocean sedimentary cores). Fine-resolution data can always be smoothed in the laboratory or on the computer. This suggests that in many cases, it might be possible to provide useful estimates of the low-frequency changes in global or hemispheric average temperatures with only a few well-separated sites over the entire earth. For example, a single time series from an ice core in Antarctica averaged over 1000 yr might be indicative of the entire Southern Hemispheric average temperature or possibly even that of the entire globe.

An important caveat is that the EBCM is hardly expected to hold in tropical areas because the midlatitude storms provide the “noise” for the stochastic model. In the tropics (say, |latitude| < 30°), the transport of heat is dominated by the Hadley circulation, which is a more direct flow than a diffusive one. This is actually helpful in the estimation of large-area averages, since it suggests that the tropics are mostly homogeneous, that is, that they have correlations that cover an entire latitude belt. Only a few gauges should be sufficient to get a good area mean.

## Acknowledgments

We acknowledge partial support from both the Harold J. Haynes Endowment at Texas A&M University and NSF Grants CMG ATM-0620624 and DMS-1007504. This publication is based in part on work supported by Award KUS-C1-016-04 made by King Abdullah University of Science and Technology (KAUST).

## APPENDIX

## REFERENCES

Arfken, G. B., and H. J. Weber, 2001: *Mathematical Methods for Physicists*. 5th ed. Academic Press, 1112 pp.

Cramér, H., and M. R. Leadbetter, 2004: *Stationary and Related Stochastic Processes: Sample Function Properties and Their Applications*. Dover, 348 pp.

Einstein, A., 1905: On the motion of small particles suspended in liquids at rest required by the molecular-kinetic theory of heat (in German). *Ann. Phys.*, **17**, 549–560.

Fisher, N. I., T. Lewis, and B. J. J. Embleton, 1987: *Statistical Analysis of Spherical Data*. Cambridge University Press, 329 pp.

Gandin, L., and T. Smith, Eds., 1997: *Averaging of Meteorological Fields*. Atmospheric and Oceanographic Sciences Library, Vol. 19, Kluwer Academic, 296 pp.

Gardiner, C. W., 1985: *Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences*. 2nd ed. Springer-Verlag, 442 pp.

Guttorp, P., and T. Gneiting, 2006: Studies in the history of probability and statistics XLIX: On the Matern correlation family. *Biometrika*, **93**, 989–995.

Handcock, M. S., and M. L. Stein, 1993: A Bayesian analysis of kriging. *Technometrics*, **35**, 403–410.

Hansen, J., and S. Lebedeff, 1987: Global trends of measured surface air temperature. *J. Geophys. Res.*, **92**, 13 345–13 372.

Hardin, J. W., G. R. North, and S. S. P. Shen, 1992: Minimal error estimates of global mean temperature through optimal arrangement of gauges. *Environmetrics*, **3**, 15–27.

Hasselmann, K., 1976: Stochastic climate models: I. Theory. *Tellus*, **28**, 473–485.

Heine, V., 1955: Models for two-dimensional stationary stochastic processes. *Biometrika*, **42**, 170–178.

Held, I., M. Winton, K. Takahashi, T. Delworth, F. Zeng, and G. K. Vallis, 2010: Probing the fast and slow components of global warming by returning abruptly to preindustrial forcing. *J. Climate*, **23**, 2418–2427.

Jones, R. H., 1963: Stochastic processes on a sphere. *Ann. Math. Stat.*, **34**, 213–218.

Jun, M., and M. L. Stein, 2008: Nonstationary covariance models for global data. *Ann. Appl. Stat.*, **2**, 1271–1289.

Kim, K.-Y., and G. R. North, 1991: Surface temperature fluctuations in a stochastic climate model. *J. Geophys. Res.*, **96**, 18 573–18 580.

Kim, K.-Y., and G. R. North, 1992: Seasonal cycle and second-moment statistics of a simple coupled climate system. *J. Geophys. Res.*, **97**, 20 437–20 448.

Kim, K.-Y., G. R. North, and G. C. Hegerl, 1996: Comparisons of the second-moment statistics of climate models. *J. Climate*, **9**, 2204–2221.

Manabe, S., and R. R. Strickler, 1964: Thermal equilibrium of the atmosphere with convective adjustment. *J. Atmos. Sci.*, **21**, 361–385.

Matérn, B., 1960: Spatial variation: Stochastic models and their application to some problems in forest surveys and other sampling investigations. *Medd. Statens Skogsforskningsinst.*, **49**, 1–144.

North, G. R., and R. F. Cahalan, 1981: Predictability in a solvable stochastic climate model. *J. Atmos. Sci.*, **38**, 504–513.

North, G. R., J. G. Mengel, and D. A. Short, 1983: A simple energy balance model resolving the seasons and the continents: Application to the Milankovitch theory of the ice ages. *J. Geophys. Res.*, **88**, 6576–6586.

North, G. R., S. S. P. Shen, and J. W. Hardin, 1992: Estimation of global mean temperature with point gauges. *Environmetrics*, **3**, 1–14.

Obukhov, A. M., 1947: Statistically homogeneous fields on a sphere. *Usp. Mat. Nauk*, **2**, 196–198.

Shen, S. S. P., G. R. North, and K.-Y. Kim, 1994: Spectral approach to optimal estimation of the global average temperature. *J. Climate*, **7**, 1999–2007.

Shen, S. S. P., G. R. North, and K.-Y. Kim, 1996: An optimal method to estimate the spherical harmonic components of the surface air temperature. *Environmetrics*, **7**, 261–276.

Smith, T. M., R. W. Reynolds, C.T. Peterson, and J. Lawrimore, 2008: Improvements to NOAA’s historical merged land–ocean surface temperature analysis (1880–2006). *J. Climate*, **21**, 2283–2296.

Wang, X., and S. S. Shen, 1999: Estimation of spatial degrees of freedom of a climate field. *J. Climate*, **12**, 1280–1291.

Wax, N., Ed., 1954: *Selected Papers on Noise and Stochastic Processes*. Dover, 337 pp.

Whittle, P., 1954: On stationary processes in the plane. *Biometrika*, **41**, 434–449.

Yaglom, A. M., 1961: Second-order homogeneous random fields. *Contributions to Probability Theory,* J. Neyman, Ed., Vol. 2, *Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability,* University of California Press, 593–622.

Yaglom, A. M., 1987: *Basic Results*. Vol. 1, *Correlation Theory of Stationary and Related Random Functions,* Springer, 526 pp.