## Abstract

The purpose of the present work is to demonstrate that a solar cycle response exists in surface temperature using the longest global dataset available, which is in the form of 1854–2007 sea surface temperature (SST), with an emphasis on methods and procedures, data quality, and statistical tests and the removal of deterministic signals, such as volcano aerosol forcing and greenhouse gas warming. Using the method of composite-mean difference (CMD) projection, signals of warming during solar maximum and cooling during solar minimum years are found in the global SST over the 14 cycles, dispelling previous claims that the solar cycle response before 1920 is opposite to that of the modern era. The magnitude of the solar cycle response averaged over the oceans between 60°S and 60°N is about 0.1°C of warming for each W m^{−2} variation of the solar constant (but is slightly lower, at ~0.085°C, when periods of suspected bad data are averaged in, which is consistent with previous work). The signal is robust provided that the years near the Second World War are excluded, during which transitions from British ships to U.S. ships introduced warm bias in the SST, as recently pointed out by D. Thompson and his colleagues. Monte Carlo tests show that the extracted signal has less than 0.02% chance of being a random occurrence. This establishes the existence of a solar cycle response at the earth’s surface at high statistical confidence. Contamination of the signal by volcano aerosols is estimated using the multiple CMD inversion method and found to be small over this long record, although ENSO contamination varies depending on the period chosen but is also small.

The multidecadal trend of response to solar forcing is found to account for no more than a quarter of the observed warming in SST during the past 150 yr, under a reasonable but unproven assumption that the climate response to secular solar forcing and to solar cycle forcing has the same spatial pattern.

## 1. Introduction

The sun’s radiant output varies quasi-periodically on a 10–11-yr time scale. In its active phase, called the solar maximum (max), the sun has more dark sunspots and accompanying bright faculae. The magnitude and indeed even the sign of this variation on the solar constant [the total solar irradiance (TSI)] were uncertain until the advent of satellites in 1979, when direct measurement above the earth’s atmosphere became feasible. Using sunspot and other proxy indices, the variation of the TSI can be extended using solar models back to the seventeenth century. The controversy concerning the TSI reconstruction is related to the secular trend of the TSI and generally is not about the classification of whether a year belongs to solar max or solar minimum (min) (see, e.g., Scafetta and Willson 2009). Only this latter minimal information is used in the present work. The terrestrial response to this variable forcing is more controversial, especially with regards to the temperature at the surface. Historically there were debates as to whether the earth was warmer or colder during the solar max as compared to the solar min. Although we previously found using modern temperature records that the global-mean temperature is warmer during solar max, there were controversial reports that perhaps in an earlier epoch the response was opposite. For example, the literature reviewed by Hoyt and Schatten (1997, chapter 5) suggests that the surface temperature is negatively correlated with the TSI during the period 1800–1920 and positively correlated from 1920 to the present, and a sign reversal was observed in the apparent dependence of water levels in Lake Victoria around 1920 (Clayton 1940). This phase reversal, if true, is difficult to understand on physical grounds and makes the search for the mechanism of the solar cycle response more elusive. One possibility could be that our sun is at the borderline between overcompensation and undercompensation of the dimming effect of the sunspots by the brightening effect of the faculae. However, modern reconstructions of the TSI (e.g., Lean 2005; Lean and Rind 1998; Lean et al. 1995) do not show this reversal between TSI maximum and sunspot number maximum.

When dealing with historical data, a major problem is that of data quality, especially during periods of world wars. Camp and Tung (2007a) and Tung and Camp (2008) found a statistically significant global temperature warming at the surface (land plus ocean) during solar max in two reanalysis datasets since the late 1950s, by which time some of the data problems likely had been corrected. Tung et al. (2008) additionally found a similar response in the two in situ data records during the same period. Questions remain concerning the existence of the solar cycle response at the surface in earlier decades and in century-long records. A simple extension of our previous work, which was done for the period from 1950s on, to earlier periods immediately runs into the period of World War II (WWII), when the data were problematic, as pointed out recently by Thompson et al. (2008).

Although “global” surface temperature datasets are available that start from 1880, large continental areas have missing coverage, with the exception of parts of North America, Europe, and Japan. Some datasets fill in the missing data using various methods, as reviewed in Tung et al. (2008). Generally, the solar cycle signal obtained by composite-mean difference (CMD) is smaller in areas where the missing data were filled in, as various interpolation schemes tend to reduce the anomaly to varying degrees. Recent satellite data (used in reanalysis) show that larger responses tend to occur over continents relative to the oceans and that they are larger over the Arctic and Antarctic relative to the tropics. Since these higher-response regions are the ones more likely to have experienced severe missing data in the long-term record, it is expected that the global mean signal in the long-term historical record with missing data is smaller than what could have been found in a geographically complete data record.

A related issue on the existence of the solar cycle response is the fact that there were major volcano eruptions that happened to be spaced on a decadal scale during the recent period: Agung in 1963, El Chichon in 1982, and Pinatubo in 1991. Previously, using 100 yr of surface temperature data and optimal filters constructed using a two-dimensional energy balance model, North and Stevens (1998) found that the volcano signal contributed significantly to the decadal peak in the climate signal spectrum. Such a contamination prevented the authors from detecting the solar signal with confidence, in contrast to their earlier work (Stevens and North 1996), where a “fairly robust solar signal” was found when other deterministic climate signals (such as volcano eruptions and anthropogenic warming) were ignored. Lean and Rind (2008) recently also pointed out that such volcano contamination could affect methods such as Fourier analysis, which is global in time. It should in principle not affect as much the local-in-time methods such as those used by us (Camp and Tung 2007a; Tung and Camp 2008; Tung et al. 2008). In our previous work, we additionally removed two years after major eruptions, when significant aerosol-induced cooling was observed. Nevertheless, it would be reassuring if the solar signal could still be found during periods when the stratosphere was clear of volcano aerosols, or when the period studied is long enough that the time of occurrence of major volcanoes can be taken as random and averaged out when we take the composite means of solar max and solar min and then difference them. A long data record affords us both possibilities. In addition, we will present an analysis using a novel method, which we call the multi-CMD inversion method, to show that volcano and ENSO contaminations are small in our solar results.

Previously there have been a number of important papers in the oceanographic literature dealing with the upper oceans’ response to the radiative forcing from the sun. Of these, the work of White et al. (1997) stood out. They pointed out that since almost 90% of the change in TSI on decadal and interdecadal time scales is at wavelengths that penetrate to the troposphere, it is plausible that direct radiative forcing by the changing solar insolation of the upper ocean can give rise to a solar signal in the SST. Using 92 yr of the Global Ice and Sea Surface Temperature (GISST) data from 1900–91, White et al. (1997) obtained a band-passed decadal signal with an amplitude of 0.08 ± 0.01°C per W m^{−2} of the TSI in the globally averaged (from 40°S to 60°N) SST. The methods used were cross-spectrum and singular spectrum analyses. The peaks of the SST appear to approximately align with the peaks of the TSI except during the beginning of the century and during 1940s and early 1950s; they suspected that the latter discrepancy occurs because of the disruption in the collection of marine data during WWII, which turned out to be the case. Allen (2000) applied the multitaper frequency-domain singular value decomposition method to the Hadley Center global surface temperature record from 1871 to 1994 and found a strong spectral peak in the 10–13-yr period, which he called the quasi-decadal oscillation (QDO). A visual inspection of the time series of this QDO now shows coherence with the 11-yr solar TSI variation, although no correlative study was done by the author. Nevertheless it appears that an 11-yr solar signal in global surface temperature exists in Allen’s filtered data. White and Tourre (2003) similarly found a statistically significant QDO peak in the 93-yr (1900–92) SST spectrum, as did Tourre et al. (2001) earlier in the 92-yr (1900–91) SST spectrum, and commented that the time series of the QDO appear to align with the solar irradiance variation. These methods are all of the Fourier type and may be subject to the volcano contamination mentioned above. We hope our work will be able to directly address the contamination due to volcano and other deterministic signals, such as greenhouse gas warming and ENSO. A new method is introduced in section 9 to separate out these various other contributions.

## 2. Data

Currently the longest homogeneous instrumented record of surface temperature exists in the sea surface temperature, which spans 150 yr from 1854 to 2007, in the form of extended reconstructed sea surface temperature (ERSST), as described in Smith and Reynolds (2003), Smith and Reynolds (2004), and Smith et al. (2008). [NOAA_ERSST_V3 data are provided by the National Oceanic and Atmospheric Administration (NOAA) Office of Oceanic and Atmospheric Research (OAR), Earth System Research Laboratory (ESRL), Physical Sciences Division (PSD), Boulder, Colorado, and are available at http://www.esrl.noaa.gov/psd/.] The dataset was based on the Comprehensive Ocean Atmosphere Dataset (COADS; available online at http://icoads.noaa.gov/Release_1/coads.html; Woodruff et al. 1998). Since 1982, SST is measured directly by satellite with global coverage, in contrast to marine air temperature. The global data were separated into “low frequency” (interdecadal) and “high frequency” (decadal) parts, and missing data were filled in using different methods. Of relevance here is the procedure for the high-frequency interpolation. The global data were expanded in empirical orthogonal teleconnections (EOTs), which are similar to empirical orthogonal functions (EOFs) with the exception noted below. The available ship and buoy data were projected onto these to help calibrate the satellite data. Prior to the availability of satellite data, there were large ocean areas without ship or buoy measurements. The available data were projected onto the leading EOTs deduced from satellite measurements after 1982. The influence by any measurement point is truncated beyond 8000 km and damped beyond 5000 km. These ranges of influence are larger than can be justified but were necessitated by the sparse coverage. The Goddard Institute for Space Studies (GISS) dataset, for example, allows a single measurement to influence other grid points only up to 1200 km based on a correlation analysis of the data points (Hansen et al. 1999). By this means, ocean SST data between 60°S and 60°N appear to be more geographically complete than land surface temperature data, and they greatly influence the global mean temperature used in the Intergovernmental Panel on Climate Change Fourth Assessment Report (IPCC AR4; Solomon et al. 2007). Recently, Thompson et al. (2008) found that the global temperature data used in the IPCC AR4 report are problematic during the Second World War, when British ships were replaced by U.S. ships. The U.S. ships measured SST using engine-water intake, which tended to be warmer than the British method of measuring SST on deck from water drawn up using buckets. The authors argued that this might account for the anomalous warming seen in the global temperature displayed in the AR4 report in the 1940s and the subsequent cooling as British ships resumed measurement in the mid-1940s. This warming and cooling were suggested from time to time by some, perhaps erroneously, to be of solar origin, arguing that they were not expected from greenhouse trends. In our current study, the years 1942–50 are deleted from our record as problematic years not yet adjusted in the data record, according to Thompson et al. (2008). Their removal resolved much of the sensitivity we were encountering with the historical data with respect to the length of record analyzed.

Additionally, because of sparse data, the ERSST data were heavily damped before 1880, but it is claimed that after 1880 the signal strength was more consistent over time. We originally performed our calculation only for the period 1880–2007. Later when we repeated the calculation for the whole period 1854–2007, the results show very little difference in the overlapped period. Hence the full record, encompassing 14 solar cycles, is shown in Fig. 1, although we do not have confidence in the spatial patterns of the response before 1880.

## 3. Composite-mean difference projection method

We use the method of CMD projection of Camp and Tung (2007a). A similar but more sophisticated method is available in the form of linear discriminant analysis (LDA) (Schneider and Held 2001; Tung and Camp 2008; see also Camp and Tung 2007b), but we chose the present simpler method for the greater ease with which others can reproduce our results, and because it is more intuitive. Briefly, this method separates temperature data into two groups, the solar max group and the solar min group. The separation is done objectively according to the TSI, to be discussed below. A global spatial pattern is obtained by composite-mean difference. The original data is then projected onto this CMD spatial pattern, resulting in a time series that may or may not vary in phase with the solar (TSI) time series. The method is successful when the correlation is high. The correlation of these two time series is tested using a Monte Carlo simulation. The unknown atmospheric population distribution is estimated by bootstrap resampling with replacement of the original temperature data, by assigning a year to either a solar max or solar min group randomly while preserving the number of years in each group. The same CMD projection method is applied to this synthetic data to produce a time series. The percentage of the time when this randomly generated time series has a correlation coefficient with the TSI equal to or higher than the observed one in magnitude is noted, and this number is often less than 0.02%. To take into account the inherent autocorrelation of the climate data, the resampling is repeated using a block of *L* years, where the length of *L* is to be determined by the autocorrelation time of the time series. Since *L* is not known a priori, we simply repeat the calculation for *L* = 1, 2, 3, …. , 10, 11, 12, etc., and report the lowest confidence level obtained, which occurs at *L* = 10. There is, however, not much change (less than 0.02%) between *L* = 1 and *L* = 10 for the decadal signal under study. The same holds true when we repeat our previously published results in Camp and Tung (2007a) and Tung et al. (2008), using *L* = 10 instead of *L* = 1. That is, all results previously deemed to be statistically significant at above 95% confidence level remain so using the moving-block resampling method.

Figure 1 shows the global mean (from 60°S to 60°N) and annual mean SST from ERSST described above for the period 1854–2007, along with the annual mean TSI from Lean et al. (2005) and Wang et al. (2005) extended to 2007 and kindly provided to us by J. Lean. It is visually apparent that there exist *non-uniform* trends in both the SST and the TSI. There was a severe cooling of over 0.6 K in the globally averaged SST in a short period of time from 1895 to 1910. Then the SST warmed by an even larger amount of 0.8 K from 1910 to 1945. In 1945 there was the sudden anomalous drop in SST studied by Thompson et al. (2008), followed by the modern global warming of 0.5 K until 2007. This latest warming is usually attributed to the increase in greenhouse gases. The warming from 1910 to 1945 is sometimes attributed to the solar forcing, as the TSI coincidentally also increased during this period (see Fig. 1). The solar max of 1910 was abnormally weak and the solar max of 1955–60 was abnormally strong, and there was a general increasing trend in between. We will show, however, that this trend in TSI during the period 1910–45 was too weak to account for the “observed” warming, which was likely due to bad data.

## 4. To detrend or not to detrend

In the period 1959–2004 previously analyzed by Camp and Tung (2007a), the TSI from Lean et al. (2005) has no trend. In the longer-term record we are analyzing here, the presence of the nonuniform trend, also from Lean et al. (2005), makes some solar max TSI values in an earlier period lower than even the solar min TSI values in the more recent period. Since from physical grounds it is the absolute irradiance that matters, with the higher TSI warming the earth more than the lower TSI does, it is not clear that a trend *should* be removed to center the TSI data. To compound the problem, the magnitude of the trend in TSI is uncertain and is currently under debate; see the IPCC AR4 report (Forster et al. 2007, p. 132). We have decided not to detrend but instead to implement a *pairwise differencing* procedure. We divide the TSI time series into subperiods each containing just one whole solar cycle (with one solar max and one solar min). Since there is very little TSI trend within a decadal period, the solar max (min) years are defined as the years when the TSI is 0.06 W m^{−2} above (below) the mean TSI for that particular short subperiod. [The 0.06 W m^{−2} threshold was introduced by Camp and Tung (2007a) so as not to count years as either solar max or solar min when their TSI variations are within ~10% of the mean peak variation of ±0.6 W m^{−2}.] This grouping/identification is objectively done for each solar cycle period. The CMD is performed on the SST data one solar cycle period at a time by taking the difference between the temperature at solar max years and at solar min years. This difference for each solar cycle period is then averaged over all solar cycles in the longer data record. This method works well even with undetrended data when the secular trends are small. During the last three decades, however, somewhat different results are found when a subperiod is defined as solar max following a solar min versus a solar min following a solar max. This problem is remedied by the procedure of *pairwise differencing with shift*, as described below.

The monotonic positive trend in the surface temperature in the recent decades may be due to forcing agents other than the TSI. An obvious candidate is the increase in greenhouse gases. To remove this contamination, we perform the above-described pairwise differencing with the following modification. A whole solar cycle subperiod is first defined as solar min following solar max. Then we repeat the procedure but by defining a whole solar cycle subperiod as solar max following solar min. This is done by shifting the years comprising a solar cycle forward by half a cycle. The CMD spatial pattern that we will use is obtained by averaging the patterns obtained with these two definitions over this one and a half period. If there is a positive 5–10-yr trend that exists within a solar cycle, it would manifest itself by yielding a higher CMD warming if the solar max follows the solar min than if the solar min follows the solar max. The averaging then eliminates the short-term trend that might be present within a solar cycle, as the positive and negative contributions of the trend to the CMD cancel each other locally (within that one and a half solar cycles). Interdecadal variations are not removed. Previously, in Camp and Tung (2007a), the linear trend that exists in the temperature record of 1959–2004 was removed by linear detrending. This is not feasible in the 150-yr data because no single linear trend exists. Piecewise linear trend removal introduces artificial jumps in temperature, which is undesirable. Our method of pairwise differencing with shift works very well and greatly reduces the sensitivity we have had in our previous trials with trend removal.

## 5. CMD projection

Figure 2 shows the longitude–latitude distribution of warming and cooling obtained by CMD (*pairwise differencing with shift*), as described in the previous section, for the period 1854–2007. The spatial distribution prior to 1880 is probably not as reliable. Therefore we repeated the calculation using the period 1880–2007, shown in Fig. 3. The differences between the two relate mostly to the fact that the amplitudes of the warming and cooling centers are slightly larger in Fig. 3, probably because the data prior to 1880 were heavily damped in the dataset. This spatial CMD pattern is denoted *P*_{1}(*x*). In the CMD projection method, the original SST data are expanded in an EOF expansion as

The orthogonality of the spatial modes is enforced by the definition of the projection coefficients:

When *C*_{1}(*t*) is defined this way, the “solar cycle” mode *P*_{1}(*x*) is orthogonal to sum of all the remaining modes, which theoretically include all other variability and noise. The lower panel in Fig. 2 shows the projected time series *C*_{1}(*t*) in blue. It is the time variation of the solar response in the SST data corresponding to the spatial pattern shown in the top panel. For convenience of presentation, *C*_{1}(*t*) is additionally normalized by the global mean of *P*_{1}(*x*), so the magnitude of *C*_{1}(*t*) is interpretable as the magnitude of the globally averaged SST variation in response to the solar TSI variation.

Looking at the time series of solar cycle response in Fig. 3, we see that the solar max *warms* relative to the solar min in globally averaged SST in the 13 solar cycles examined. There was not a phase reversal in 1920 or during any other period. The amplitude of the global SST response is about 0.1°C per each W m^{−2} [the scale of TSI and *C*_{1}(*t*) are scaled 1 W m^{−2} to 0.1°C to facilitate this comparison]. There are, however, a few cycles in which the amplitude is smaller, and this can usually be attributed to questionable data. When regressed over all cycles (excluding, however, the period 1942–50 mentioned earlier), including periods of remaining bad data, the warming in globally averaged SST (over 60°S and 60°N) is *κ* ~ 0.085°C per W m^{−2} for the period 1880–2007. This amplitude is about the same that found by White et al. (1997) for the period 1900–91. The solar cycle response amplitude found here for the SST is about 60%–70% of that found in the land–ocean average found by Tung et al. (2008) for the in situ data of GISS and HadCRUT3. This finding is consistent with the value of *κ* = 0.12°C per W m^{−2} found for the land–ocean average in those two in situ datasets because warming is usually stronger over continents. The ratio of this ocean average versus global average of land and oceans is even smaller in the recent reanalysis data, also shown in Tung et al. (2008), which included areas poleward of 60°N and 60°S with amplified warming, not included in the in situ data.

There is a severe cooling trend after the eruption of Santa Maria in October 1902 that lasted more than a decade, longer than can be expected from volcano aerosol cooling. Interestingly, this cooling does not project onto the solar response pattern, indicating that this severe cooling may be due to noise or more likely bad data, and is effectively filtered out by our projection method. The decade after WWII produced a solar max response that is smaller than expected from the TSI. The global SST is actually very warm during that solar max (see the black line in Fig. 1), but it does not project onto the solar response spatial pattern. This is an indication that the spatial pattern of the SST during that decade is not consistent, as the mix of British and U.S. ships was changing (Thompson et al. 2008). The WWII years likewise do not project significantly onto the solar response spatial pattern (not shown) if the latter is obtained for the period 1854–2007 with the WWII years excluded, showing that the warming and cooling during that period were not solar related. Nevertheless the erroneous temperature discontinuity is so large that if the WWII years had been included in our calculation of the spatial patterns, they would have contaminated that pattern.

## 6. Statistical tests

The correlation coefficient *ρ* between the temperature response *C*_{1}(*t*) and the TSI is about 0.69 for the period 1880–2007 and 0.65 for the period 1854–2008, both quite high for such a long data record and extremely unlikely to be producible by chance if there were no solar cycle signal in the SST (the null hypothesis). Figure 4 shows the distribution of *ρ* in 10 000 synthetic SST time series generated using the method of bootstrap with replacement, to be described now. The relationship between the TSI and years is not randomized but held fixed as the real values, so the grouping of years into the solar groups remains the same as described in section 3. However, the temperature value for a particular year (say, 1880) is drawn randomly from a year (which could happen to be 1920) in the real SST data record. Afterward that year is returned to the SST record and another year is drawn randomly from this entire SST data to be assigned to 1881 and so on. (The year chosen in a previous step needs to be returned to the pool before another year is chosen; otherwise later draws would not be independent of the early ones. For example, if the years were not returned to the pool and *N* − 1 yr were chosen, the *N*th year would be dependent on the previous *N* − 1 yr.) In this way, the years are populated by SST values. The original association of the temperature with the solar groups is destroyed, but the number of years in each solar group is maintained. The CMD projection method described in section 3 is then applied to this synthetic SST data to generate a time series *C*_{1}(*t*), which is then correlated with the TSI time series to yield a correlation coefficient. Repeating this procedure many times (e.g., 10 000), one can then establish a confidence level to reject the null hypothesis by seeing how many synthetic correlation coefficients are less than the observed value.

To take into account the fact that our temperature data may be serially correlated (Zwiers 1987, 1990), the above bootstrap method is modified using the so-called moving-block bootstrap (Efron and Tibshirani 1993; Lahiri 2003; Leger et al. 1992; Wilks 1997). Blocks of *L* successive data values are resampled instead of resampling individual data values. The value of *L* is defined so that data values of *L* distance or farther away from each other are essentially independent. Generally it is difficulty to theoretically determine an appropriate block length *L* (Leger et al. 1992). However, under the assumption that the original time series is modeled as a first-order autoregressive process, Wilks (2006, chapter 5) has suggested that a good choice of the block length *L* is given by

where *n* is the sample size, *n*′ = *n*(1 − *ρ*_{1})/(1 − *ρ*_{1}) is the approximated effective sample size, and *ρ*_{1} is the lag-1 autocorrelation coefficient. For our problem, the block length *L* calculated using the above formula varies from 1 to about 20 yr depending on the spatial location. Since the temperature data may not follow AR(1) processes very well, the above estimate for *L* may still not be accurate. The method that is adopted here is actually quite simple: we repeat the calculation for each value of *L* and conservatively take the value of *L* that yields the lowest level of statistical confidence. This occurs at 10 yr. We still find that very few of the 10 000 synthetic SST time series achieve a correlation equal to or higher than the observed value. Thus, we have effectively ruled out the null hypothesis that our method can by random chance generate an apparent “signal” highly correlated with the TSI when no real solar signal exists in the data.

One may be suspicious of this high confidence level and question whether it can be caused by the fact that it is helped by the existence of the long-term trend in the observed time series, with the temperature in earlier decades before WWII lower than in the more recent decades after the war, while in the synthetic data there is no consistent trend because of the scrambling of the years. It turns out that unlike the regression coefficient *κ*, the correlation coefficient is not sensitive to the presence of trend in *C*_{1}(*t*). When we remove the trend in *C*_{1}(*t*) before correlating it with TSI, *ρ* is changed only slightly. The results of such a calculation are indicated in Fig. 4.

## 7. Spatial features in ocean basins

We will next discuss the features in Fig. 3, obtained using the better data since 1880. It shows that the response over oceans has both warming and cooling distributed in some characteristic patterns, more so than the warming over continents found in our previous work. The ocean area-averaged temperature is therefore smaller than the local SST anomaly, which ranges from −0.2° to +0.2°C. In the Atlantic Ocean, the tropics are cold south of the equator but warm a little north of it. The northwestern Atlantic is cold. The Indian Ocean is warm. These features are robust. The robust basinwide warming in Indian Ocean—a small ocean basin—may indicate a radiative response to solar forcing, in contrast to the situation in the larger ocean basin of the Pacific, which is capable of fast dynamical responses involving coupled atmosphere and oceans (Meehl and Arblaster 2009; Tourre et al. 2001; White and Tourre 2003).

In the Pacific Ocean, there is a robust warming center located in the northwestern Pacific and cooling off the west coast of the United States. There is generally cooling in tropical eastern Pacific, with the exception of a thin warming strip located at the equatorial Pacific, where the ENSO variance is large. The warming center in the northwestern Pacific is robust, but the warming strip in the eastern Pacific is not (cf. Figs. 2 and 3, and see later figures).

Recently, van Loon et al. (2007) and van Loon and Meehl (2008) studied specifically the spatial pattern in the Pacific during northern winter using the same ERSST data since 1854. They calculated their composite mean difference by taking the difference of the mean of the “solar peak years” (one year per solar cycle) and the climatology, in effect using only 14 degrees of freedom. The climatology was calculated over a different period than that from which the solar peak years were chosen [The period used in the climatology calculation was based on only 30 yr, 1950–79, in van Loon et al. (2007). A different 29 yr, 1968–96, was used in van Loon and Meehl (2008).] Over the equatorial east Pacific, they found a cold event (La Niña)–like condition, which was deemed statistically significant by the Student’s *t* test. Our Monte Carlo test of bootstrap resampling cannot be applied to their methodology because there is only one data point in each solar cycle. There is no time series information on the response for us to test the similarity between the response and the forcing when only one year is used for each solar cycle. The Student’s *t* test they used does not actually test if the signal is solar related; it merely tests if the mean of the solar peak years is significantly different from the mean of the years used in defining the “climatology.” It is in this regard that the subjective choice of the years used in the calculation of climatology affects the result of the Student’s *t* test. Because the period 1968–96, chosen by van Loon and Meehl (2008) for the climatology, is warmer, it yields a larger-amplitude equatorial Pacific SST cold tongue when it is subtracted from the solar peak mean, and therefore it passes the Student’s *t* test. This is their best result, reproduced here in the top panel of Fig. 5. The yellow contour encloses regions of statistical significance at the 95% confidence level, and we see that the cold tongue at the eastern Pacific and a warm pool over the northwestern Pacific are both statistically significant, as discussed in detail by van Loon and Meehl (2008). This result, however, is not robust to the choice of either the so-called solar peak years or of the base period for the calculation of climatology. The middle panel in Fig. 5 is done in the same way as in van Loon and Meehl (2008) except that the peak solar years are chosen objectively according to the peaks in TSI. The spatial pattern is rather different—the La Niña pattern is disrupted—but nevertheless the eastern Pacific is cold and still statistically significant. This changes again when the proper climatology is taken, using the same period (1854–2007) as that from which the solar peak years were chosen. This is the most objective way for the composite difference and the result is shown in the bottom panel of Fig. 5. None of the features in the Pacific is statistically significant by the Student’s *t* test.

The hypothesis that it is the solar peak years that causes the La Niña–like response in the equatorial Pacific (van Loon and Meehl 2008; van Loon et al. 2007) and that one or two years later the response switches to an El Niño–like pattern (Meehl and Arblaster 2009) may still be correct, and it appears to be supported by modeling results as reported in Meehl et al. (2009). The observational support for this hypothesis, however, is not yet available. It is likely that 150 yr of data is not long enough for us to separate out different behaviors in the first versus second year of a solar max.

The question of whether the equatorial Pacific responds to a warmer climate in a La Niña– or an El Niño–like pattern is under debate in the context of global warming. Vecchi et al. (2008) showed that the ERSST data we are using give a long-term trend in the form of an El Niño–like pattern while a different SST dataset, HadISST, gives a La Niña–like pattern. They attributed the difference to the difference between the two datasets in two periods: the 1930s and the 1980s, which corresponded to the periods of greatest change in the “buckets-to-intake” correction of SST measurements previously implemented (i.e., prior to Thompson et al. 2008) and the beginning of SST retrievals using satellites.

## 8. Multidecadal trend

Since our method does not involve detrending of temperature or TSI, there is a secular SST response seen in Fig. 2 to the secular trend in the solar forcing. Generally, the level of SST solar response is consistent with the level of TSI forcing, with periods of high SST associated with periods of high TSI. By regressing *C*_{1}(*t*), using just the solar min years, or just the solar max years, or the entire time series, onto the years to determine the slope of the time series, the amplitude of the global SST trend arising from the solar influence is found to be about 0.004 ± 0.0012°C decade^{−1} for the period of 1854–2007 or 0.009 ± 0.0017°C decade^{−1} for the period 1880–2007. These bracket the solar trend over the last century, 0.007 ± 0.001°C decade^{−1}, reported by Lean and Rind (1998). However, this slope of *C*_{1}(*t*) is not a robust quantity since the actual trend is nonlinear. To set an upper bound on the solar forcing contribution to the warming trend, we give two maximum values, one using only peak solar max years and one for peak solar min years: it warmed by 0.18°C from the solar max of 1909 to the solar max of 2002. During this period the global mean SST warmed by 0.89°C, and so no more than 20% of that may be attributed to solar forcing during this period. A larger warming of 0.21°C is found from the solar min of 1913 to the solar min of 2005. This last number, 0.21°C, is deemed the upper bound in the secular change in SST that can be attributed to solar forcing, first because that is the difference between the lowest and the highest temperatures in the solar min in the entire record, and second because some greenhouse warming residue may arguably remain in *C*_{1}(*t*) during the most recent solar cycle (possibly since the solar min in 2007 is the last half cycle analyzed) despite our best efforts in removing it. During this same period of time, the global mean SST warmed by 0.81°C, and so no more than 26% of it can be attributed to solar forcing. These are upper bounds; the true solar trend is probably lower. These changes in SST associated with the interdecadal changes in solar forcing are quite modest and in no way can account for the observed warming trend in SST during the last century (see Fig. 1). The latter must have been caused by other forcing agents, including anthropogenic ones.

## 9. Volcano, ENSO, and greenhouse warming contamination: A multi-CMD analysis

When there is long enough data, within-group variances caused by volcanoes and ENSO, which are not consistently correlated with the solar cycle, are hopefully greatly reduced by the composite means and by the differencing of the two groups. Nevertheless, how well these variances are removed has always been a concern. In the analysis shown in Figs. 2 and 3, no volcano years were removed before processing, unlike the procedure in Camp and Tung (2007a). The result is not so different from that obtained (not shown) by excluding the volcanic years from the analysis. The time series *C*_{1}(*t*) is highly correlated with the solar index (*ρ* = 0.69) and not correlated with the volcanic aerosol index (Sato et al. 1993); the latter correlation coefficient, *ρ _{AI}* = −0.08, is practically zero. We will show directly below that volcano contamination is indeed very small. Global warming due to increases in greenhouse gases is another important contamination to the solar signal. Nonetheless, the method that we introduced in section 4 to obtain the solar signal reduces this contamination greatly, as we will quantify below. ENSO is a prominent variability in the Pacific Ocean and can affect significantly the SST patterns studied here, more so than the land–ocean patterns studied previously. For the present study, extreme ENSO years, defined as when the winter [December–February (DJF)] mean cold tongue index (CTI) exceeds 1.2°C in magnitude, are excluded in the analysis presented in sections 3 and 4. The resulting

*C*

_{1}(

*t*) has a correlation coefficient with the annual mean CTI index of −0.13 for the period 1880–2007, which is small enough for the ENSO contribution to the derived solar signal to be negligible. To verify that these contaminations are already small in what we have produced, we shall now try to separate out these four deterministic signals and show that our results on the solar cycle response are not changed.

In a typical error analysis, one assumes that the data consist of the signal under study and a remainder, called “noise.” A noise model needs to be constructed; usually either a random white-noise model or a red-noise model is assumed. As pointed out by North and Stevens (1998), neither of these noise models is appropriate because the climate data contain prominent deterministic signals such as ENSO and volcano aerosols, and they need to be taken into account explicitly. We shall assume that our data *D*(*x*, *t*) consist of multiple deterministic signals plus a random noise, in the following form:

where the *p*’s are the *true* (unknown to us) spatial patterns of the climate influences and the *θ*’s represent their time behavior. The subscripts *S*, *E*, *V*, and *A* indicate the solar, ENSO, volcanic, and anthropogenic greenhouse gas increases, respectively; and *R*(*x*, *t*) is the residual noise, assumed to be random. Superficially this assumed form for the data appears to be quite similar to what is assumed in multiple regression methods. However, the least squares multiple regression method minimizes the sum of squares of *R*(*x*, *t*), while our estimate of the true spatial patterns is obtained by assuming that the means of *R*(*x*, *t*) itself are small. Our assumption appears justified in a long data record, where if enough deterministic signals are taken out from *D*(*x*, *t*), the remainder can be assumed to be approximately random with a very small mean.

Here we first perform the CMD procedure four times on the data, each with the two groups selected according to a different forcing agent (thus, the years contributing to the calculation of the CMD may vary for each climate signal), to establish the following linear equations:

where the *P*_{1}’s are the (known) composite-mean differences of the data *D*(*x*, *t*) and *α _{R}*’s are the (unknown) CMD of the noise

*R*(

*x*,

*t*). Here

*α*,

_{S}*α*,

_{E}*α*, and

_{V}*α*are the CMD of the four

_{A}*θ*’s, which we will assume to be the same as those calculated using the prescribed forcing index for each phenomenon. The superscripts

*S*,

*E*,

*V*, and

*A*indicate by which forcing agent the two groups are defined when calculating the CMD. They are omitted when they are the same as the subscript. We let

*p̂*be the

*estimate*of the true spatial pattern

*p*(

*x*) obtained by ignoring the noise CMD

*α*:

_{R}The error of the estimations can be found by computing

where 𝗠^{−1} is the inverse of the matrix in Eq. (5). It can be seen that the error in the estimated spatial patterns is now caused solely by the random noise, and in particular the CMD of the noise, which is small.

Since the variations of different climate forcings are usually not in phase, the nondiagonal elements of the matrix 𝗠 (and thus 𝗠^{−1}) are all expected to be small, as they turn out to be in our case. Equation (6) then implies that there is very little cross-contamination of errors. The CMD of the random noise in the right-hand side of Eq. (6) should be small if the data record is long. However, even in a long record there may not be enough occurrences of volcano eruptions to make the volcano CMD of the noise small. Any such error due to poor volcano sampling will stay as an error in the estimated volcano spatial pattern and not cross-contaminate the solar spatial pattern estimate.

Shown from top to bottom in Fig. 6 are the estimated spatial patterns *α _{S}p̂_{S}*,

*α*,

_{E}p̂_{E}*α*, and

_{V}p̂_{V}*α*obtained for the period 1880–2007. Based on each climate forcing, we pick the years and group them to compute the CMD as follows (the years 1942–50 are always removed beforehand): The solar max (min) years are defined according to the TSI index as having a TSI 0.06 W m

_{A}p̂_{A}^{−2}greater (smaller) than the local mean of a complete solar cycle. The warm (cold) ENSO years are defined as the years when the annual mean CTI is greater (less) than 0.25°C (−0.25°C). The volcano years are 1883–85, 1902–04, 1963–65, 1982–84, and 1991–93, including three years after each major eruption indicated in Fig. 1. The nonvolcanic group contains years when the annual mean aerosol index is no larger than 0.005 optical depths. The two anthropogenic groups comprise the years when the global mean CO

_{2}mixing ratios (Hansen et al. 1998) are 10 ppm above or below the mean of the entire period of data record, 1880–2007.

It is worthwhile to point out that Eq. (4) is derived by applying the simplest composite mean differencing. For example, is the simple difference between the mean temperature of all the solar max years and that of all the solar min years during the whole period. This (not shown) is not a good estimate to the true solar cycle spatial pattern because it contains other deterministic signals such as volcano aerosols and greenhouse gas warming. (Recall that the error in the solar spatial pattern obtained by the simple CMD method is

It is only used as an intermediate step in the calculation; *P*_{1}^{S} is very different from the estimate, , which removes these deterministic contaminations as shown above. The surprising finding is that this estimate obtained via multiple CMD inversion (in the top panel of Fig. 6) is very close to that obtained in section 4 using the single *pairwise differencing with shift* method (shown in Fig. 3). The latter method is effective in removing the secular trend, which is presumably due to the anthropogenic greenhouse gases also removed in the multiple CMD inversion. The length of the record serves to average out the volcanic and ENSO contaminations, yielding very small differences in the spatial patterns calculated by these two very different methods.

## 10. Conclusions

It is often thought that the response to solar cycle is too weak at the surface to be detectable, and that even if a signal is claimed to have been found its statistical significance cannot be established. Using 150 yr of sea surface temperature data from 1854 to 2007 and an objective method, we found a robust signal of warming over solar max and cooling over solar min, with high statistical significance in the time domain. The amplitude of the signal in the SST averaged over the ocean areas between 60°N and 60°S is ~0.085°C of warming for each W m^{−2} of the change in TSI, which is about 70% that found in land–ocean averages (~0.12°C per W m^{−2}) found in the recent in situ data by Tung et al. (2008), as was expected because the response over continents and over the Arctic is known to be larger.

Volcanic eruptions tend to have a significant contribution to the decadal period peak in any spectral analysis; therefore, contamination of the solar cycle signal by volcanic signal has been a longstanding concern. Using 150 yr of data we have now shown that the volcanic contamination is negligible using our method of *pairwise differencing with shift*. This is further confirmed using a new method of multiple-CMD inversion, similar to the multiple regression method, in which the deterministic volcanic signals are separated out.

Our method of projecting the observed data onto a consistent spatial pattern determined by composite-mean difference of the whole period appears to be effective in reducing contamination by short periods of bad data, which tend to have inconsistent spatial patterns. This effect is in contrast to that of the method of (multiple) regression using least squares fit of the time series, which is affected by outliers (which may likely be caused by bad data).

In the method of multiple regressions as applied to solar variation by previous authors, an index of solar forcing as a function of time, often in the form of TSI or sunspot number, needs to be prescribed, and the resulting response is assumed to vary in time in exactly the same way as the imposed index, albeit with the possibility of a lag. Our method of CMD projection depends only on the classification of years into the solar max or solar min group and does not require that we know the detailed variation of the total solar irradiation or its long-term trend. In this way we bypass the controversy concerning the magnitude of the solar forcing trend in these 150 yr. Assuming that multidecadal SST response has the same spatial pattern as the decadal response, we additionally obtain a secular century trend; the latter is consistent with Lean’s reconstruction of solar forcing. Our result shows that less than a quarter of the observed temperature trend can be attributed to solar forcing.

## Acknowledgments

We thank Dr. Charles D. Camp for helpful suggestions on the moving block bootstrap test. We are also grateful to Dr. Jerald Meehl and two other reviewers for very helpful comments. The research is supported by the National Science Foundation, Climate Dynamics Program, under Grant ATM 0808375.

## REFERENCES

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

**,**

## Footnotes

*Corresponding author address:* Ka-Kit Tung, Dept. of Applied Mathematics, University of Washington, Box 352420, Seattle, WA 98195. Email: tung@amath.washington.edu