## Abstract

Regional climate trends are of interest both for understanding natural climate processes and as tests of anthropogenic climate change signatures. Relative to global trends, however, their detection is hampered by smaller datasets and the influence of regional climate processes such as the Southern Oscillation. Regional trends are often presented by authors without demonstration of statistical significance. In this paper, regional-average annual series of air temperature and sea surface temperature for the New Zealand region are analyzed using a systematic statistical approach that selects the optimum statistical model (with respect to serial correlation, linearity, etc.), explicitly models the interannual variability associated with observable regional climate processes, and tests significance. It is found that the residuals are stationary and are a red noise process [ARMA(1,0)] for all the series examined. The SOI and a meridional circulation anomaly index (both high-pass filtered) are “explanatory variables” for interannual variability. For national-average air temperature (AT), linear and exponential trend models are equally valid but for simplicity the linear model is preferred. Failure to model the serial correlation in AT would result in an estimated confidence interval for trend that is too small by 36%. The use of the explanatory variables tightens the confidence interval by 15%.

Significant trends were detected. The trend in AT for 1896–1994 is 0.11 ± 0.035°C decade^{−1} (95% confidence interval). This is about double the trend reported for global data, which may be due to the relative absence of sulfate aerosols in the South Pacific region. The trends in maximum and minimum temperature over this period are not statistically different. However, for the later period of 1951–90, the trend in maximum temperature reduces to an insignificant value, while the trend in minimum temperature remains high, resulting in a significant downward trend in diurnal range of 0.10°C decade^{−1}. Similar diurnal range behavior in other regions has been tentatively attributed to increasing cloudiness. The trend in a regional SST series for 1928–94, 0.07°C decade^{−1}, is about half the trend in AT for the same period. The trend in the difference, SST–AT, −0.06°C decade^{−1}, is statistically significant. This implies the existence of an atmospheric warming source for the additional air temperature trend, and may mean that the heat fluxes between the atmosphere and ocean in the New Zealand region are subject to a large trend, with the direction of flux change being toward the ocean. The results of the study are consistent with the IPCC predictions of climate change.

## 1. Introduction

The detection of climate change, including those predicted to occur from rising concentrations of atmospheric greenhouse gases (Wigley and Barnett 1990; Gates et al. 1992), may be sought in the historical climate record of the last 100–150 yr. To date, global-average temperature series show increasing trends, with highest annual mean temperatures mostly in the last decade, but at this stage the trends cannot be ascribed directly to greenhouse gas increases (Folland et al. 1990; Folland et al. 1992). Among other reasons, the global series exhibit significant variation on annual to decadal scales, the sources of which are not understood, though much of it may be due to forcing by tropical ocean temperature variation (Graham 1995).

Temperature trends are expected to vary systematically between regions, in response to global greenhouse gas increases, industrial sulfate aerosol emissions in Northern Hemisphere continental regions (Isaksen et al. 1992; Taylor and Penner 1994; Jones et al. 1994), and cloudiness changes (Karl et al. 1993). The regional scale is of particular interest to nations and economic groups and often has the homogeneous historical datasets needed for climate change research. Regional variations in temperature trend are widely evident in data (Folland et al. 1990, 1992; Parker et al. 1994; Salinger et al. 1995) and can provide a basis for testing trend signatures predicted by climate models.

Trend detection at the regional scale is complicated, however, by the relatively greater variation associated with natural climatic processes, such as the El Niño–Southern Oscillation (ENSO) phenomenon (Jones 1989; Salinger et al. 1995; Basher and Thompson 1996). This, together with the smaller datasets, limits the statistical significance of regional trend estimates relative to those for global trend estimates. It is not uncommon for both global and regional temperature trends to be estimated from a comparison of the averages for the decades at either end of the record (Folland et al. 1992; Salinger et al. 1993; Parker et al. 1994) or to be taken as self-evident from the generally rising pattern evident in plots. One of the aims of this paper is to assemble a systematic statistical approach to trend estimation that includes significance testing and enables better detectability in regional series.

In most previous statistical modeling studies, it is generally assumed that the annual mean temperature time series is made up of a long-term trend component and a residual component that is often assumed to be white noise (Wigley and Jones 1981; Jones 1989). If the residuals are serially correlated, this will result in the overestimation of the effective sample size of the residuals (Leith 1973; Trenberth 1984), and hence overestimation of the significance of a trend. Recently, some advanced studies (Bloomfield 1992; Bloomfield and Nychka 1992; Woodward and Gray 1993) have introduced regression models with stationary and serially correlated residuals to address the problem. Continuing this approach, we set out procedures for selecting the optimum residual model for the dataset under consideration. We consider linear and nonlinear trends.

The fundamental statistical models we use are regressions with serial correlation (Kohn and Ansley 1985; Bloomfield 1992; Bloomfield and Nychka 1992; Woodward and Gray 1993). These models are not as popular as least squares regression models, perhaps because the corresponding statistical inferences and computational routines are not well known. For this reason, we briefly introduce the statistical inferences of the model and point to suitable computational programs available in the advanced statistical package S-PLUS (Chambers and Hastie 1992; StatSci 1993). A set of steps is provided to lead the reader through the required procedures.

As already noted, some of the interannual variability at the regional scale is known to be related to readily observed factors, such as the Southern Oscillation or regional circulation anomalies. In principle, if these factors are unrelated to long-term change and forcing factors, and are themselves free of measurement error, they can be modeled to reduce the variability of the series. Statistically, this would increase the ratio of variation of trend to variation of residuals and improve the confidence interval of the trend estimate. In fact, since the Southern Oscillation and circulation anomalies cannot be assumed to be free of long-term trend or measurement error, it is necessary to apply a high-pass filter to their series before developing the model, in order to avoid imposing erroneous trend on the series being studied (the implications of filtering will be further discussed in section 2). Jones (1989) applied this idea by removing the influence of a filtered series of the Southern Oscillation index (SOI) on his global temperature series, before undertaking trend analysis. To identify the role of external influences on temperature trends, Richards (1993) and Visser and Molenaar (1995) directly included in their models unfiltered explanatory variables, such as enhanced CO_{2}, tropospheric sulphate pollution, and stratospheric volcanic aerosol.

The geographical focus of this paper is the New Zealand region (Fig. 1). Annual mean series of national-composite daily surface air temperature (mean, maximum, minimum, and diurnal range), and regional-mean sea surface temperature (Fig. 2) are analyzed. Statistically significant trends are identified and are interpreted in the light of global climate change science. The South Pacific region is one of the few areas of the vast oceanic areas of the Southern Hemisphere for which there are adequate datasets. Urbanization is relatively limited, and the region is relatively free of air pollution, being remote from the dominant sources of anthropogenic sulphate aerosols and having very little local industrial pollution. Some aspects of global climate change might therefore be more readily evident in these datasets. Recent studies have established the general patterns of trend for New Zealand and the South Pacific (Salinger et al. 1993; Salinger 1995; Salinger et al. 1995; Folland and Salinger 1995). An important feature of the region is the strong relationship of temperatures at monthly and interannual timescales with the Southern Oscillation and associated regional circulation anomalies (Trenberth 1976; Gordon 1986; Basher and Thompson 1996).

## 2. Statistical models

The fundamental statistical model used in this study is

where {*y*_{t}} are annual means of temperature; index *t* runs over *T* yr; *μ* is the temperature normal over these years, *f** _{t}*;(

*λ*

_{l}, . . . ,

*λ*

_{r}) is a function of

*t*with parameters, {

*λ*

_{l}, . . . ,

*λ*

_{r}} representing the trend to be detected,

*r*is the number of parameters; {

*x*

_{i,t},

*t*= 1, . . . ,

*T*} is the

*i*th explanatory variable, which is observable and noncolinear with

*f*

_{t};

*k*is the number of explanatory variables,

*β*

_{i}is the coefficient for the

*i*th explanatory variable, and {

*ε*

_{t}} is the residual time series, which is an autoregressive-moving average process of order (

*p*,

*q*) [ARMA(

*p*,

*q*), Box and Jenkins (1976)]; that is,

where *p* and *q* are nonnegative integers, {*ϕ*_{l}, . . . , *ϕ*_{p}} are the autoregressive coefficients, {*θ*_{1}, . . . , *θ*_{q}} are the moving average coefficients, and {*η*_{t}} is a white noise process with variance *σ*^{2} (the innovation variance).

### a. Trend term

The trend *f*_{t} can be in a very general form (Bloomfield 1992). In the case of trends related to enhanced greenhouse gas forcing, the representations that are of most interest are linear trend

and exponential trend

For diurnal temperature range, we find the piece-wise two-section linear trend is of interest:

where t_{0} is the turning point.

### b. Explanatory variables

If there is a significant response of the regional temperature {*y*_{t}} to interannual regional circulation anomalies and this response is not modeled, the ratio of trend signal to noise is reduced and a real trend, if present, may be not detectable. However, where the circulation influence can be represented by a linear response to some measurable explanatory variable (e.g., the SOI), it is possible to include the variable in the model to reduce the noise level and thus improve the quality of the estimated trend. All such explanatory variable series must be subjected to high-pass filtering prior to doing the regression model development and analysis, to ensure the estimated trend is uniquely represented by *f*_{t} without influence from low-frequency variability (or systematic error) in the explanatory variables. A filter cutoff at about 10 yr is suitable to distinguish between the long-term variability of {*y*_{t}}, which is relevant to detecting the effect of enhanced greenhouse warming and sulfate aerosol cooling (Wigley and Raper 1990), and the shorter-term variability associated with the Southern Oscillation and related circulation anomalies. Filtering does not allow any estimate of how *f*_{t} depends on the explanatory variables, but this is not a problem for us as our aim is to determine a statistically significant trend estimate for the unfiltered measured temperature series.

### c. Residuals

Residuals characterize the variability not explained by the trend and the explanatory variables. To ensure that the trend is represented only by *f*_{t}, it is important to confine the residuals to be a stationary process, because nonstationary processes, such as the autoregressive-integrated-moving average process [ARIMA(*p,d,q*), *d* = 1, 2, . . . , that is, the *d*-order difference series of {*y*_{t}} being an ARMA( *p,q*) process (Box and Jenkins 1976)], may include random trends (Woodward and Gray 1993). ARMA( *p,q*) processes are stationary, short-range correlated normal processes. White noise residuals [ARMA(0,0); e.g., Jones 1989] and red noise residuals [ARMA(1,0), Wigley and Raper 1990] are the most commonly used, though Bloomfield (1992) demonstrated that *p* > 1 is more appropriate for the residuals {*ε*_{t}} of the IPCC and Hansen–Lebedeff global temperature series. Bloomfield also showed that an alternative candidate is the stationary, autoregressive, fractionally integrated moving average process [ARIMA( *p,d,q*), Grainger 1980]; that is,

where 0 < *d* < 0.5 is the differencing parameter and the other symbols are as defined in (2), but no statistical test is available to compare the relative quality of the ARIMA( *p,d,q*) and ARMA( *p,q*) options. At the regional scale, the greater annual variability in temperature series means that long-range correlation will be less evident and hence that the residuals can be satisfactorily characterized by an ARMA( *p,q*) process with small *p,q.*

Visser and Molenaar (1995) further decomposed the residuals into an ARMA process {*ε*_{t}} and an independent white noise process {*ξ*_{t}}, but the meaning of {*ξ*_{t}} is not well explained. In fact {*ξ*_{t}} mainly represents the *climate noise* generated by daily weather variability [see Leith (1973) for a more general discussion] and the variance of {*ξ*_{t}} can be estimated from the daily observations (Zheng 1996). When monthly averages or seasonal averages of daily temperature are considered, {*ξ*_{t}} is often significant and cannot be omitted, but for annual averages of daily temperature the magnitude of {*ξ*_{t}} is relatively small. Furthermore, the climate noise will be correlated with regional circulation anomalies and will be partially explained by the fitted explanatory variables. For these reasons, {*ξ*_{t}} is omitted in this paper.

## 3. Fitting procedure

In this section, we describe how to fit the basic statistical model to observations, with reference to the statistical software package S-PLUS (StatSci 1993).

### a. Remove low-frequency variability from candidate explanatory variable series

Filter the candidate explanatory variable series, for example, by S-PLUS routine “filter,” where the 13-term high-pass Gaussian filter (with a 50% response at 10 years) is constructed by S-PLUS routine “demod.”

### b. Identify order ( p,q ) for residuals {*ε*_{t}}

Keep all candidate explanatory variables in model (1). For an integer *m* (≥*p*), define the Akaike Information Criterion (AIC, Akaike 1977) for (1) as

where {*ε*_{t}} is a realization of an ARMA( *p,q*) process; *L*(*ε*_{m+1}, . . . , *ε*_{t} | *ε*_{1}, . . . , *ε*_{m}) is the conditional likelihood of *ε*_{m+1}, . . . , *ε*_{t} given *ε*_{1}, . . . , *ε*_{m} and is a function of *μ*; *λ*_{1}, . . . , *λ*_{r}; *ϕ*_{1}, . . . , *ϕ*_{p} and *θ*_{1}, . . . , *θ*_{q} [for the detailed form of L, see StatSci (1993), Eq. (16.46)]; and the maximum is taken over all the parameters. The order ( *p,q*), which corresponds to the lowest AIC, is the order to be selected.

We set *m* = 10 as the initial value. In our experience, this is suitable for most annual means of temperature, even for global series. However, if the model obtained from the initial trial has high autoregressive order, that is, *p* close to 10, then *m* would need to be increased to obtain the optimal model. The S-PLUS routine “arima.mle” is used to calculate the AIC. For a given *p*, “arima.mle” calculates AIC(*y*_{p+1}, . . . , *y*_{t} | *y*_{1}, . . . , *y*_{p}), which means that the first *m*–*p* observations have to be removed in order to get AIC(*y*_{m+1},..,*y*_{t} | *y*_{1}, . . . , *y*_{m}). It should be noted that “arima.mle” is applicable only to linear *f*_{t} with ARMA( *p,q*) residuals. However, if *f*_{t} is nonlinear, the AIC can be estimated by the following two steps. First, fix *λ*_{1}, . . . , *λ*_{r}, consider the statistical model

and estimate its AIC [denoted as AIC(*λ*_{1}, . . . , *λ*_{r})] using “arima.mle.” Second, calculate the AIC for model (1) by minimizing AIC(*λ*_{1}, . . . , *λ*_{r}) over all *λ*_{1}, . . . , *λ*_{r} using S-PLUS routine “nlminb.”

The selected ( *p, q*) can be further judged by checking the autocorrelation function, the partial autocorrelation function (Milionis and Davies 1994), and the smoothed periodograms of the residuals (Bloomfield and Nychka 1992) using S-PLUS routines “arima.diag” and “spec.pgram.”

### c. Select explanatory variables

Fix the order ( *p,q*) identified in section b. The explanatory variables that correspond to the lowest AIC are selected and are denoted as {*x*_{1,t}, . . . , *x*_{k′,t}}, where *k*′ ≤ *k*2. Then the statistical model is identified as

where {*ε*_{t}} is an ARMA( *p,q*) process. All the parameters can be estimated by “arima.mle.”

### d. Test the significance of the trend

Suppose *L*_{n}(*μ*; *β*_{1}, . . . , *β*_{k′}; *ϕ*_{1}, . . . , *ϕ*_{p}; *θ*_{1}, . . . , *θ*_{q}) is the likelihood function for the model

and *L*_{a}(*μ*; *λ*_{1}, . . . , *λ*_{r}; *β*_{1}, . . . , *β*_{k′}; *ϕ*_{1}, . . . , *ϕ*_{p}; *θ*_{1}, . . . , *θ*_{q}) is the likelihood function for the model (9). Then the Neyman–Pearson statistic (Rao 1973) is defined as

Under the null assumption *f*_{t} = 0, RP is distributed asymptotically as a chi-square distribution with *r* degrees of freedom [*χ*^{2}(*r*)]. If the estimated RP is greater than the 100-*α* percentile of *χ*^{2}(*r*), the trend *f*_{t} is significant at the *α* percent level.

### e. Calculate confidence intervals for the linear trend

Consider the linear trend model

where {*ε _{t}*} is an ARMA(

*p,q*) process. Let {

*μ̂*,

*λ̂*

_{1},

*β̂*

_{1}, . . . ,

*β̂*

_{k′};

*ϕ̂*

_{1}, . . . ,

*ϕ̂*

_{p};

*θ̂*

_{1}, . . . ,

*θ̂*

_{q}} be the estimated parameters. Then the covariance matrix of {

*μ̂*,

*λ̂*

_{1},

*β̂*

_{1}, . . . ,

*β̂*

_{k′}} can be estimated as

(Jones 1992), where **V**, the autocovariance matrix of {*ε _{t}*}, can be estimated by solving Yule–Walker equations (Box and Jenkins 1976) with coefficients {

*ϕ̂*

_{1}, . . . ,

*ϕ̂*

_{p};

*θ̂*

_{1}, . . . ,

*θ̂*

_{q}} and

**X**is a

*T*× (

*k*′ + 2) matrix whose (

*t, i*)th entry is

*x*if

_{i,t}*i*≤

*k*′, or 1 if

*i*=

*k*′ + 1 or

*t*if

*i*=

*k*′ + 2. Inverse matrices can be calculated by Choleski decomposition using the S-PLUS routine “chol.”

The 100-*α* percent confidence interval for *λ*_{1} can be constructed as *λ̂*_{1}±*σ̂*_{k′+2}t_{α/2}(*T*), where t_{α/2}(*T*) is the 100-*α*/2 percentile of the *t* distribution with *t* degrees of freedom and *σ̂*_{k′+2} is the square root of the (*k*′+2)th diagonal element of **Σ**.

## 4. Data

### a. Temperature series for trend detection

#### 1) New Zealand air temperatures

Composite New Zealand annual mean series (AT, MAXT, and MINT) of daily mean, daily maximum, and daily minimum temperature, respectively, were formed from a nationally representative set of seven long-term measurement sites. The stations and their starting years (Salinger 1981; Folland and Salinger 1995) are Auckland, 1855; Masterton, 1908; Wellington, 1863; Nelson, 1863; Hokitika, 1863; Lincoln, 1864; and Dunedin, 1853. Records were not taken between 1881 and 1893 at Hokitika or between 1881 and 1905 at Nelson. The annual mean series were formed by unweighted combination of the individual series’ anomalies relative to their 1951–80 normals, added to the average of these normals. Considerable work has been done to assess the quality of New Zealand data and correct for site changes and other problems (Rhoades and Salinger 1993; Salinger et al. 1993). Stevenson screens were introduced from around 1869, and urbanization effects over the course of the twentieth century on the national composite series are likely to be very small (Hessell 1979; Folland and Salinger 1995). New Zealand’s size and unique position within the midlatitude oceanic westerly wind belt ensures a well-ventilated windy regime leading to negligible differences between rural and urban temperatures. Studies elsewhere (Jones et al. 1989) indicate that urbanization accounts for only a very small proportion of the observed warming trends in Northern Hemispheric temperature averages. An annual mean series of diurnal temperature range (DTR) was formed by the difference between MAXT and MINT (Fig. 2).

The incompleteness of the composite temperature series before 1908 is not believed to be a problem for the analysis of trend in AT, MAXT, and MINT, since the contributing station series are each normalized to their 1951–80 normals, the spatial correlation in New Zealand temperatures is high, and the remaining stations retain a representative distribution. The pattern of variation for AT, MAXT, and MINT appears to be stable throughout the period. However, the individual station DTR series are not normalized and therefore the composite DTR series will exhibit shifts in mean when a station is added or removed. In particular, the addition of Nelson in 1905 and Masterton in 1908 will raise the composite mean DTR, since both have relatively high DTR. A sample calculation on the 1950–89 period showed that this effect amounts to about 0.7°C. For this reason the DTR trend analysis is confined to the homogeneous period from 1909 onward. The substantial reduction in all the temperature series in the early 1990s is associated with the 1991 Mt. Pinatubo eruption and the prolonged 1991–94 El Niño–Southern Oscillation episode (Folland and Salinger 1995; Basher and Thompson 1996).

#### 2) Sea surface temperatures

Following Basher and Thompson (1996), an annual series of regional mean SST anomalies was formed for the area 30°–50°S and 160°E to 175°W (Fig. 1) based on the 5° lat × 5° long data in the United Kingdom Meteorological Office ATLAS7 global dataset. Detailed discussions of the SST data resources and their accuracy have been given by Folland et al. (1992) and Parker et al. (1994) for the global database and by Folland and Salinger (1995) for the New Zealand region. Systematic errors of the order of 0.1°–0.2°C may affect the data and likewise any trends estimated from them.

### b. Explanatory variable series

#### 1) Southern Oscillation index

The SOI is taken as the mean sea level pressure difference between Tahiti and Darwin normalized to unit standard deviation (Troup 1988). Annual averages of monthly values of the SOI using a 40-yr normal base of 1941–80 were computed and are shown in Fig. 3 for the period 1860–1994. The trend analysis used a filtered version, soi, which is essentially free of low-frequency (≥10 yr periods) variations.

#### 2) Circulation indices

Trenberth’s (1976) indices of circulation anomalies are based on station mean sea level pressure (MSLP). An index for a pair of stations is simply the monthly average MSLP difference between the stations, minus the long-term mean of this difference calculated over a normal period, usually 1951–80. A nonzero index implies an anomalous geostrophic wind normal to the pressure gradient anomaly. To represent the broad circulation anomaly patterns, we have chosen two approximately orthogonal indices, namely Z1 (Auckland–Christchurch, Fig. 1), which represents zonal flow anomalies across most of the length of New Zealand, and M1 (Hobart–Chatham Islands), which represents meridional flow anomalies for a wide region from Australia to the oceanic areas east of New Zealand. The indices are available from 1896 onward (Fig. 3). The trend analysis uses high-pass filtered versions, m1 and z1. There is evidence of observation errors in the early pressure data for some stations and hence errors in Z1 and M1 for the earlier decades. No attempt at correction has been made as little is known about the type of barometers used during these periods. Such measurement errors will not cause error in our trend estimates, owing to the use of the high-pass filter, but they may slightly increase the variance and hence increase the confidence intervals of the trend estimates.

## 5. Fitting models to data

The procedures described in section 3 were first applied to the mean temperature series AT, with the linear trend option. The model details and related mathematical considerations are as follows.

Step (a). The chosen high-pass filter is applied to SOI, Z1, and M1 to produce series soi, z1, and m1. The correlations of soi, z1, and m1 with time are small and not significant indicating that there is little colinearity between linear trend and the explanatory variables.

Step (b). ARMA( *p,q*) models with *p* ≤ 10, *q* ≤ 5 were examined, but only the results for white noise ARMA(0,0) residuals and red noise ARMA(1,0) residuals are worth listing (models 1 and 2 in Table 1). On the basis of lowest AIC, we find that the residuals of the AT series are an ARMA(1,0) process and hence that serial correlation in the New Zealand temperature series is significant.

Woodward and Gray (1993) showed that if {*y*_{t}} is modeled as a trend with ARMA residuals, but is actually an ARIMA process, then erroneous trends are likely to be detected. They later developed a bootstrap approach to discriminate between these two models, showing that the ARIMA model is statistically preferred for the main global temperature series, and that on this basis the significance of the trends in these series cannot be established (Woodward and Gray 1995). Returning to our regional series, if the process is ARIMA( *p*,1,*q*) and the ARMA( *p* + 1,*q*) model is fitted to the series, its characteristic equation in *x*

where *ϕ̂*_{i} are the estimated autoregressive coefficients, is likely to have a near-unit root *x*_{1}; that is, |*x*_{1} − 1| < 0.2. However, calculations on AT without trend show that for *p* ≤ 10, *q* ≤ 5 there are no near-unit roots for the residual model ARMA( *p,q*), and therefore the ARIMA( *p*,1,*q*) process is not likely in AT.

Woodward and Gray (1993) examined several methods to determine linear trend, and carried out simulations of each to determine the probability of incorrect detection of trend, when no trend is present. They showed that excessive significance is a serious problem when a unit root or very near unit root is present, and that better results are obtained when autocorrelation is modeled. We have carried out parallel simulations (Table 2), which show that where the trend is estimated by our proposed method, the probability for incorrect detection of trends is lower than for the two methods (WG3 and WG4) proposed by Woodward and Gray (1993), though the problem of excessive significance for the unit root or very near unit root case remains. In the case of AT, the fitted first order autocorrelation is *ϕ̂*_{1} = 0.42, so the root of the characteristic equation (1/*ϕ̂*_{1}) is sufficiently far from unity and our significance test is valid.

Bloomfield (1992) showed that for the IPCC and Hansen–Lebedeff series, the AIC for linear trend and stationary ARIMA( *p,d,q*) (i.e., 0 < *d* < 0.5) residuals (long memory) is less than the AIC for linear trend with ARMA( *p,q*) residuals (short memory). For AT, the AIC for the long memory model (using S-PLUS routine “arima.fracdiff”) is more than the AIC for the short memory model. However to the best of our knowledge, there is not a theoretical basis for using AIC to compare these two different models. In our regional series, if the process is trend plus long memory residuals but we fit a model of trend plus ARMA( *p,q*) residuals, then the estimated *p* is likely to be large. Since we find *p* = 1 for the AT series, we conclude that AT does not contain a long memory residual process.

Step (c). Table 1 shows that for AT the significant explanatory variables are soi and m1 (but not z1). The optimal linear trend model for AT is model 6. We find that the use of the explanatory variables provides not only a lower AIC, but also an estimated autocorrelation function that is closer to the theoretical function for the derived ARMA(1,0) model. This tends to support the appropriateness of using explanatory variables.

Step (d). Table 1 also shows that the −2 ln likelihood ratio for model 6 against model 9 (for the selected explanatory variables but no trend) is 20, which passes the 99.9 percentile of the *χ*^{2}(1) distribution. The linear trend is therefore significant at the 0.1% level.

By repeating steps (a)–(d) with the exponential trend model fitted to AT, we find that the exponential trend is also significant. The AIC for the optimal exponential model (44.4) is similar to the AIC for the linear model 9 (44.2), which means that the exponential model is an acceptable alternative for AT. It also indicates that over the 1896–1994 period the temperature trend itself may be rising. Although the exponential model is as valid as the linear model, it has no statistical advantage and is not used any further in the present study.

Step (e). For the preferred model 6, the 95% confidence interval for the linear trend is 0.11 ± 0.035°C decade^{−1}. The linear trends for the other models are relatively insensitive to the choice of residual process and all fall in the above range. The linear trends are also insensitive to the choice of explanatory variable set, owing to the removal by the high-pass filter of essentially all long-term variation in the explanatory variable series. The advantage of using the soi and m1 explanatory variables is shown by the 15% reduction in standard deviation of model 6 relative to model 7, that is, by the 15% reduction in the trend estimate’s confidence interval.

Comparison of the standard deviations of the AT trend coefficients shows that the neglect of serial correlation (model 1), which is a common practice, would underestimate the confidence interval by 36% when explanatory variables are used (model 2), and by 30% when no explanatory variables are used (model 8 vs model 7). It is likely that similar underestimation will occur in other regions of the globe, especially where oceanic or other slowly varying climate processes confer interannual persistence on temperature series.

The above procedures have been individually applied to MAXT, MINT, DTR, and SST, and also to the SST–AT difference series. All the series are best represented by the ARMA(1,0) model. The soi and m1 explanatory variables are required for all the maximum, minimum, and mean air temperature models (though m1 is not available prior to 1896). However, neither explanatory variable is needed for DTR, presumably because the impacts of soi and m1 are similar for MAXT and MINT.

## 6. Results and discussion

The trend results for all series are listed in Table 3. The first main result is that the New Zealand mean annual air temperature series exhibits a highly statistically significant, positive linear trend. The linear trend estimate for AT (1896–1994) is 0.11 ± 0.035°C decade^{−1} (95% confidence level). To allow for comparison with global trend estimates, calculations are presented in Table 3 for the periods used by Bloomfield (1992) for the IPCC series (1861–1989, 0.037 ± 0.010°C decade^{−1}), and for the Hansen–Lebedeff series (1880–1987, 0.055 ± 0.015°C decade^{−1}). The New Zealand trend is about twice the trend in these global series. The estimated trend in AT is dependent on the period examined, which is to be expected given the long-term variations seen in Fig. 2, but all are consistently upward.

Our analysis cannot identify the source of the trends detected, but some general comment is possible. First, although the dataset is thought to be of reasonably high quality and to be free of significant false trends, systematic measurement error cannot be ruled out. Second, long-term natural changes in climatic processes are possible, especially in the Pacific basin where long timescale oceanic processes may be present. There is some evidence in Fig. 3 of multidecadal trend in the unfiltered regional circulation indices M1, Z1, though these may be due to measurement error. Third, the trend value of 0.11°C decade^{−1} is very similar to the lower rates of global temperature rise predicted by greenhouse gas forced transient GCMs (without aerosol effects) for the first few decades of the 6–10 decades needed to reach a doubling of CO_{2} (Gates et al. 1992). However, these models also predict that the warming trends for oceanic areas and especially the Southern Hemisphere will be smaller, not larger, than those of the more continental Northern Hemisphere, owing to the sequestration of heat by the oceans (Gates et al. 1992). The smaller Northern Hemisphere trend may be due to regional sulfate aerosol pollution, which is expected to cause a cooling effect that may substantially offset the predicted heating due to enhanced greenhouse gases, both through direct radiative effects (Taylor and Penner 1994), and through indirect effects on cloud nucleation and cloud characteristics (Jones et al. 1994).

The second main result concerns statistically significant trends in maximum temperature, minimum temperature, and diurnal temperature range. Taken over the whole period 1896–1994, the upward trend in the MAXT and MINT are similar (0.09 and 0.10°C decade^{−1}). The trend in DTR for the available homogeneous period 1908–94 is slightly downward (−0.03°C decade^{−1}). However, over the more recent period, 1951–90, the trend in maximum temperature becomes statistically insignificant (0.03 ± 0.13°C decade^{−1}), while the trend in minimum temperature remains high and statistically significant (0.13 ± 0.10°C decade^{−1}; the largest of the trends found in our study). Correspondingly, the DTR trend over 1951–90 reaches −0.10 ± 0.04°C decade^{−1}, which is also statistically significant. Salinger (1995) identified a DTR decline of −0.24°C from 1941–90 in his New Zealand Region T3 series, and found some indication of seasonal and orography-related variation. Figure 2 indicates that the DTR decline is part of a relatively recent systematic multidecadal nonlinear variation. Over the period 1909–50 the DTR trend is not statistically different to zero. The exponential trend model does not suit the 1909–94 DTR dataset, but the piece-wise two-section linear trend model, chosen to have a turning point at 1950 and no trend in the first section, was found to be significant at the 1% level using the *χ*^{2} test.

The diverging trends in MAXT and MINT for 1951–90 are consistent with the IPCC conclusion that “the observed global warming over the past several decades is primarily due to an increase in daily minimum temperatures with little contribution from daily maximum temperatures” (Folland et al. 1992), and the corresponding DTR trend is similar to those reported by Karl et al. (1993) for various other regions. The cause of this DTR behavior is not yet clear. Karl et al. argue that it is not expected as a result of enhanced greenhouse gas concentrations and that the preferred explanation is increasing cloudiness. The global extent of the DTR response indicates that it cannot be a direct radiative response to regional sulphate aerosol pollution, though aerosol-induced changes in cloud condensation nucleii concentrations and droplet distributions (Jones et al. 1994) may affect cloud amounts elsewhere. Cloudiness is difficult to define and measure accurately, but studies to date indicate that it is increasing, mainly in Northern Hemisphere continental regions (Folland et al. 1990, 1992). No results on cloudiness trend are available for the New Zealand region. Australian datasets show a clear tendency toward cloud cover increases (252 out of 318 stations) but no firm conclusion on trends is possible (Jones and Henderson-Sellers 1992).

The third main result concerns the smaller trend for regional SST relative to that for national mean air temperature. The SST trend estimated for 1928–94 was statistically significant at 0.07 ± 0.06°C decade^{−1}. This is noticeably smaller than the AT trend for the same period, although the trends are not statistically different. To pursue this issue, we formed a series of the SST-AT difference and examined this for trend. The result was statistically significant at −0.06 ± 0.05°C decade^{−1}. The smaller confidence interval for the SST–AT trend relative to the SST and AT trend means there is unmodeled short-term climatic variation that is correlated between the SST and AT series. Because the available SST data density increased markedly after about 1950, further analyses were done for the more reliable but shorter 1951–90 period. The resulting trends of 0.02°C decade^{−1} for SST and 0.10°C decade^{−1} for AT are not individually significant, but the trend of −0.07 ± 0.06 in SST–AT is statistically significant. (Again, there is a substantial reduction in variance in the SST–AT trend, relative to the individual SST and AT trends.) The similar SST–AT trends for the two periods, 1951–90 and 1928–94, may suggest the source processes are relatively stable and long term. However, given the small sample size and the marginal significance of the trends, some caution is warranted in the interpretation of the trends. On this point, to test the validity of the significance test used, a simulation study using the S-PLUS routine “arima.sim” was performed on 500 ARMA(1,0) series with the sample size (40), autoregressive coefficient (0.35), and innovation variance (0.01875°C^{2}) derived from the SST–AT series. This showed that, at the 5% significance level, the rate of false positive trend detection using our method was less than 5%, which confirms that the sample size and test approach are acceptable.

There is a possibility that systematic error in the SST data may be a factor in this result, though the main changes in SST observation (Folland at al. 1990) would tend to make a positive rather than negative contribution to the SST trend. If the SST–AT trend is real, and the AT data can be taken as representative of the surrounding marine air temperature (perhaps with a constant bias) then for this region we can conclude that the atmosphere and ocean are not in stable thermal equilibrium on the multidecadal scale, and that there is a long-term atmospheric warming process, which is causing additional trend in air temperature beyond that expected from the direct “slave” response of air temperature to SST trend. We cannot draw a conclusion on the size of atmospheric contribution to the SST trend as for this limited region there may be long-term oceanic variation unrelated to atmospheric forcing. We can also surmise that the changing difference between air and sea surface temperature will result in changing ocean–atmosphere heat fluxes, either as an increasing flux from the atmosphere to the ocean or as a reducing flux from the ocean to the atmosphere. Since the difference between marine air temperature and SST in this midlatitude region is small on average, the 0.5°C net change in SST–AT over the period 1928–94 will represent a sizeable change in this difference and hence potentially in heat fluxes (assuming little change in the mechanics of surface exchange). This, in turn, suggests that there may be a large trend in atmosphere–ocean net heat fluxes for this region of the globe, the direction of the change in flux being toward the ocean.

A final general point worth noting is the potential for improved detection capability using paired quantities, such as occurred with our DTR and SST–AT trend analyses. First, there is the possibility of obtaining information on climatically important quantities, for example, the unknown atmospheric radiative process in the DTR case, and the atmosphere–ocean heating rates in the SST–AT case. Second, there is the possibility of signal to noise gains. Whether a net gain occurs will depend on the correlation between the two series’ signals, the correlation between the two series’ unmodeled variation, and the magnitudes of the random noise components. Differencing will reduce the effect of unmodeled variations that are positively correlated (e.g., as in MAXT and MINT), but it also will increase the random component and reduce the size of the trend signal if the component signals in the two series are positively correlated. Differencing may prove advantageous where specific climate trend signatures are expected, for example, in latitudinal and vertical temperature gradients, and coastal-interior contrasts.

## 7. Conclusions

Statistically significant trends have been detected in a set of regional-average temperature series, through the application of a systematic statistical approach to choosing optimal trend models, which simultaneously deals with long-term trends, serially correlated residuals, and short-term regional climatic influences.

The trends in the New Zealand national-mean air temperature and the regional mean SST show interesting features that appear to have relevance to global change issues. The trends are generally consistent with the IPCC scenario of enhanced greenhouse forcing in which temperature trends in the oceanic midlatitude Southern Hemisphere are relatively modest owing to sequestration of heat by natural oceanic processes, and in which the expected larger temperature trends in the Northern Hemisphere are masked by atmospheric cooling arising from some other factor, such as anthropogenic sulfate aerosol pollution. The results also support the view that there is another major atmospheric change factor of global extent, possibly increasing cloudiness, that is reducing the diurnal temperature range.

## Acknowledgments

This work was supported by the New Zealand Foundation for Research, Science and Technology under Contract CO1308. We wish to thank Dr. Jim Salinger for the provision of the homogenized temperature data series and other advice; Drs. Richard Katz, Peter Thomson, and Laimonis Kavalieris for discussions on statistical models; and the referees for their advice on statistics methodology issues.

## REFERENCES

_{2}-induced climate change. Nature, 292, 205–208.

## Footnotes

*Corresponding author address:* Dr Xiaogu Zheng, NIWA, P.O. Box 14-901, Wellington, New Zealand.

Email: x.zheng@niwa.cri.nz