Optimal Fingerprinting with Estimating Equations

Sai Ma (Department of Statistics, University of Connecticut, Storrs, Connecticut),
Tianying Wang (Department of Statistics, Colorado State University, Fort Collins, Colorado; https://orcid.org/0000-0002-2826-5364),
Jun Yan (Department of Statistics, University of Connecticut, Storrs, Connecticut; https://orcid.org/0000-0003-4401-7296), and
Xuebin Zhang (Climate Research Division, Environment and Climate Change Canada, Toronto, Ontario, Canada)

Open access


Abstract

Climate change detection and attribution have played a central role in establishing the influence of human activities on climate. Optimal fingerprinting, a linear regression with errors in variables (EIVs), has been widely used in detection and attribution analyses of climate change. The method regresses observed climate variables on the expected climate responses to the external forcings, which are measured with EIVs. The reliability of the method depends critically on proper point and interval estimations of the regression coefficients. The confidence intervals constructed from the prevailing method, total least squares (TLS), have been reported to be too narrow to match their nominal confidence levels. We propose a novel framework to estimate the regression coefficients based on an efficient, bias-corrected estimating equations approach. The confidence intervals are constructed with a pseudo residual bootstrap variance estimator that takes advantage of the available control runs. Our regression coefficient estimator is unbiased, with a smaller variance than the TLS estimator. Our estimation of the sampling variability of the estimator has a low bias compared to that from TLS, which is substantially negatively biased. The resulting confidence intervals for the regression coefficients have coverage rates close to the nominal level, which ensures valid inferences in detection and attribution analyses. In applications to the annual mean near-surface air temperature at the global, continental, and subcontinental scales during 1951–2020, the proposed method led to shorter confidence intervals than those based on TLS in most of the analyses.

Significance Statement

Optimal fingerprinting is an important statistical tool for estimating human influences on the climate and for quantifying the associated uncertainty. Nonetheless, the estimators from the prevailing practice are not as optimal as believed, and their uncertainties are underestimated, both owing to the unreliable estimation of the optimal weight matrix that is critical to the method. Here we propose an estimation method based on the theory of estimating equations; to assess the uncertainty of the resulting estimator, we propose a pseudo bootstrap procedure. Through extensive numerical studies commonly used in statistical investigations, we demonstrate that the new estimator has a smaller mean-square error, and its uncertainty is estimated much closer to the true uncertainty than the prevailing total least squares method.

© 2023 American Meteorological Society. This published article is licensed under the terms of the default AMS reuse license. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

Corresponding author: Tianying Wang, tianying.wang@colostate.edu


1. Introduction

The successive assessments of the Intergovernmental Panel on Climate Change (IPCC) have established that human influence has resulted in global warming, mainly through the emission of greenhouse gases (Hegerl et al. 2007; Bindoff et al. 2013; Eyring et al. 2021). Climate change detection and attribution provided the critical evidence leading to the IPCC conclusions. Optimal fingerprinting (OF), a multiple linear regression model, is the most widely used method for the detection and attribution of climate change. It regresses the observed climate variable of interest on the fingerprints, the expected responses of the climate system to external forcings (Hegerl et al. 1996; Allen and Tett 1999; Allen and Stott 2003). The regression coefficients are called scaling factors. Their point estimates scale the fingerprints to best match the observed climate change. If the confidence interval (interval estimate in statistics) of a scaling factor lies entirely above 0, then the effect of the corresponding external forcing is said to be “detected” in the observed data. If, in addition, the confidence interval covers 1, then this is necessary (though not sufficient) evidence that the observed changes can be “attributed” to that external forcing.

Both point and interval estimation of the scaling factors are at the center of the statistical inference in detection and attribution analyses. The point estimate of a scaling factor reflects how well the model-simulated response captures the magnitude of the observed changes. A good point estimator should be unbiased, with a variance as small as possible. The interval estimate is important because it quantifies the uncertainty of the estimation. Another aspect that has not received wide attention in the climate literature is the so-called coverage rate of a confidence interval. The statistical meaning of a confidence interval is often misinterpreted. In (frequentist) statistics, a 90% confidence interval for a target does not mean that the target is within the particular interval with 90% probability. Rather, it means that if the estimation were repeated many times, about 90% of the resulting intervals would cover the target. A proper confidence interval should have a coverage rate, the percentage of times that it covers the target in repeated estimations, equal to its nominal level.

The linear regression in the OF setting has two distinguishing challenges compared to the standard setting. First, the predictors, or the fingerprints of the external forcings, are not observed but estimated from climate model simulations, typically as multimodel ensemble averages. Because individual simulations contain natural variations of the climate, averaging will not remove the noise completely. That is, the average contains noise, or errors in the predictors, leading to the so-called errors-in-variables (EIV) issue, also known as measurement error in statistics (Carroll et al. 2006). Different climate models may produce different climate responses, and the model structural differences can be modeled explicitly (Huntingford et al. 2006), but this is not considered here. If ignored, EIVs may yield a severely biased estimator of the scaling factors. Under the assumption that the natural climate variability in individual model simulations is the same as that in the observations, the errors in the estimated fingerprints have the same covariance structure as the internal climate variability $\Sigma$, but their magnitudes differ depending on the number of simulations being averaged. Allen and Stott (2003) first addressed the EIV issue for OF with total least squares (TLS), which remains in common use.

The second challenge is that the response variable of the OF (i.e., the observed climate variable) is spatially and temporally dependent. As a result, the covariance matrix $\Sigma$ of the regression error vector is needed to prewhiten the data. Nonetheless, $\Sigma$ is not known and cannot be estimated from the observed data alone, as there is only one observation per site and time point. In practice, $\Sigma$ is estimated using climate model simulations under the assumption, again, that model-simulated natural variability properly represents real-world natural climate variability. As $\Sigma$ is of high dimension, the available model simulations may not be sufficient to provide a reliable estimate. Methods have been proposed to improve the estimation, including a regularized estimator of $\Sigma$ that ensures positive definiteness (Ribes et al. 2013). Confidence intervals of the scaling factors can be constructed based on the normal approximation of the estimator (Ribes et al. 2013; DelSole et al. 2019; Li et al. 2021) or on the bootstrap (Pešta 2013; DelSole et al. 2019).

The impact of using an estimated $\Sigma$, especially one based on relatively small samples, was not studied until recently. The resulting scaling factor estimator is no longer optimal in root-mean-square error (RMSE) (Li et al. 2023). The inverse of $\Sigma$ acts as a weight, and other weights may yield a better estimator of the scaling factors in terms of RMSE when $\Sigma$ is estimated with a high level of uncertainty. Further, the resulting confidence intervals do not account for the uncertainty in the estimated $\Sigma$ and thus fail to provide enough coverage for the scaling factors. To reduce the effect of using an estimated $\Sigma$ on the uncertainty of the interval estimate, a common practice is to produce two separate estimates of $\Sigma$, one for prewhitening and the other for inference (Hegerl et al. 1996; Allen and Stott 2003). This approach has been reported not to give confidence intervals with sufficient coverage (Li et al. 2021). Hannart (2016) proposed an integrated OF method, in which the unobserved measurement errors in the predictors and the unknown $\Sigma$, under an inverse Wishart prior, are both integrated out of the likelihood in closed form. However, the prior used in that study was too informative to be practical, and it is not clear how to properly specify the parameters of the prior distribution. Li et al. (2021) proposed to fix the undercoverage issue with a parametric bootstrap calibration method that enlarges the confidence intervals so that their coverage rates match their nominal levels. The method may not work well, however, when the sample size of the data for estimating $\Sigma$ is limited.

To tackle both the point estimation and the interval estimation tasks in the OF regression, we propose a novel framework based on estimating equations (EEs). The EE method is a widely used estimation technique in statistics (e.g., Godambe 1991; Heyde 2008). The method of least squares and the method of maximum likelihood frequently used in the climate literature are special cases of the EE method. For interested readers, we provide a brief tutorial of EEs in appendix A. For linear regression with EIVs, the bias-corrected EE method is a standard approach for regression coefficient estimation (e.g., Carroll et al. 2006). The extra challenge in OF is to appropriately account for the spatiotemporal dependence in both point and interval estimations. As will be shown in numerical studies, our point estimators are unbiased with smaller RMSEs, and our confidence intervals provide coverage rates close to the nominal level.

The rest of the paper is organized as follows. In section 2, we propose an EE method for estimating the scaling factors in OF and a pseudo bootstrap method to estimate the variance of the estimator, which leads to confidence intervals with desired coverage rates. In section 3, we report a simulation study that shows the competitiveness of our method in comparison with existing approaches in both point and interval estimations. The methods are applied in section 4 to the detection and attribution analyses of the annual mean temperature of 1951–2020 at the continental and subcontinental scales. A discussion concludes in section 5. To improve readability, we relegate technical details, including the basics of EEs, bias correction, constructing Σ with a block Toeplitz structure, and data description, to the appendixes.

2. Methods

In the following, we will start with the basic concepts, including the main ingredients and assumptions involved in OF. Then, we present our estimation for the scaling factors and construct their confidence intervals. We will also describe the diagnostics of our statistical model.

a. Statistical model

Three ingredients are used in typical detection and attribution analyses.

  1) Observational data. This is a dataset of observed climate variables that potentially contains climate change signals to be detected. It can be, for example, a spatial map of long-term temperature trends, or a time series of the annual mean temperature over the globe or over a region. Most often, it consists of the spatial and temporal evolution of a climate variable, which may enable detecting the effects of different external forcings separately.
  2) Signal(s), or expected climate response(s), to one or more external forcings. Since they are not known, they are often estimated by the responses simulated by climate models under different external forcings, such as anthropogenic (ANT) forcing, external natural (NAT) forcing, or combined ANT and NAT (ALL) forcing.
  3) Information about natural internal climate variability, or noise. This is typically based on climate model simulations under the “control” condition without external forcing. The residuals of individual ensemble members after the removal of model-simulated responses can also be used to supplement the control simulations (in this paper, we do not make the distinction and refer to both as control simulations).

The three parts will be further explained as we introduce the notations next.

We consider a general setting with $T$ time periods and $S$ spatial regions (referred to as sites below). For ease of notation, we assume that there are no missing data, but the method still works with missing values, as we will show in section 2b. Let $Y_{ts}$ be the observation of the climate variable, covering time $t = 1, \ldots, T$ and space $s = 1, \ldots, S$. Indexing the time and space separately facilitates different treatments of the temporal and spatial dependences in section 2b. Suppose that one is interested in separately detecting signals from $J$ external forcings. Let $X_{tsj}$ be the true (unobserved) fingerprint of the $j$th external forcing, $j = 1, \ldots, J$. Let $m_j$ be the number of simulations in the ensemble whose average $\tilde{X}_{tsj}$ is used to estimate $X_{tsj}$, $j = 1, \ldots, J$. With each time period treated as a map or a cluster, define for cluster $t$: $Y_t = (Y_{t1}, \ldots, Y_{tS})^{\mathrm{T}}$, $X_{tj} = (X_{t1j}, \ldots, X_{tSj})^{\mathrm{T}}$, and $X_t = (X_{t1}, \ldots, X_{tJ})$.

The OF framework assumes that the responses of the climate system to different forcings are additive. It links the observed climate variables to the signals by the linear regression model (e.g., Allen and Stott 2003)
$$Y_t = X_t \beta + \epsilon_t, \quad t = 1, \ldots, T, \tag{1}$$
and
$$\tilde{X}_{tj} = X_{tj} + \nu_{tj}, \quad j = 1, \ldots, J, \quad t = 1, \ldots, T, \tag{2}$$
where $\beta = (\beta_1, \ldots, \beta_J)^{\mathrm{T}}$ is a $J$-dimensional vector of unknown scaling factors, $\epsilon_t$ is an $S$-dimensional regression error, and $\nu_{tj}$ are $S$-dimensional noises (measurement errors in statistics) in estimating the true signal $X_{tj}$ with $\tilde{X}_{tj}$. The regression error $\epsilon_t$ and measurement errors $\nu_{tj}$ have mean zero and covariance matrices $\Sigma_t$ and $\Omega_{tj}$, respectively. Let $\epsilon = (\epsilon_1^{\mathrm{T}}, \ldots, \epsilon_T^{\mathrm{T}})^{\mathrm{T}}$ and $\nu_j = (\nu_{1j}^{\mathrm{T}}, \ldots, \nu_{Tj}^{\mathrm{T}})^{\mathrm{T}}$, $j = 1, \ldots, J$. Let $\Sigma$ and $\Omega_j$ be the $TS \times TS$ covariance matrices of $\epsilon$ and $\nu_j$, respectively. With each block representing one time point, the diagonal blocks of $\Sigma$ and $\Omega_j$ are, respectively, $\Sigma_t$ and $\Omega_{tj}$.

1) Assumption 1

The noises in the estimated signals {νj: j = 1, …, J} are mutually independent and are independent of ϵ.

Assumption 1 implies that the measurement errors νj are independent across ensemble members and between observations and model simulations. Note that this assumption does not assume independence across different time points within each ensemble member. This assumption is valid because simulations for individual signals are conducted separately, and they are also independent of the evolution of the natural climate.

The internal natural variability of the climate system is represented by $\Sigma$, which is unknown but critical in making inferences about $\beta$. A common strategy is to estimate $\Sigma$ from control runs of climate model simulations, which are assumed to reflect the pattern of internal climate variability. Let $\{\epsilon^{(1)}, \ldots, \epsilon^{(L)}\}$, each of dimension $ST$, be independent control simulations with sample size $L$. Let $\Psi$ be the variance matrix of $\epsilon^{(l)}$, $l = 1, \ldots, L$.

2) Assumption 2

The internal natural variability simulated by the climate models has the same temporal and spatial structures as those of the observations, though the magnitudes may differ. That is, for a scale parameter a > 0, Ωj = aΣ/mj, j = 1, …, J, and Ψ = aΣ.

Assumption 2 implies that internal climate variability is not affected by external forcing. Although there are examples, such as Arctic sea ice extent, where internal variability may change regionally under certain external forcings (Bonan et al. 2021; Swart et al. 2015), this assumption is reasonable for many OF settings over the period for which we have historical data. Typical OF applications are usually implemented assuming $a = 1$, which is checked through the residual consistency test (Allen and Tett 1999). Assumption 2 makes our method less restrictive and adaptable to more general cases. It is possible to relax this assumption further by allowing different models to simulate different magnitudes of variability. As this is not the main focus of the paper, we keep our statistical model simple by assuming a unified $a$ and provide a test for statistical model diagnostics, as detailed later.

Estimating Σ with control simulations is challenging, as L could be much smaller than ST, which motivated the regularized OF (Ribes et al. 2013). Imposing some structures on Σ helps to improve the estimation if the structures are reasonable.

3) Assumption 3

The natural internal climate variability does not change over time, that is, it is temporally stationary.

While a typical OF analysis does not explicitly make this assumption, the estimation of $\Sigma$ from control simulations in practice does not consider temporal changes. Additionally, there is no evidence indicating covariance changes on the time and space scales of a typical OF analysis. This explicit assumption greatly reduces the number of parameters in $\Sigma$. It implies that $\Sigma$ has a block Toeplitz structure, with the diagonal blocks $\Sigma_t$ the same for all $t$, and that the covariance $\Sigma_{t,t'}$ of $\epsilon_t$ and $\epsilon_{t'}$ depends only on the time lag $|t - t'|$. This assumption does not, however, state that the off-diagonal blocks are zero. In general, we expect $\Sigma_{t,t'}$ to decrease as $|t - t'|$ increases. As will be clear next, we discard the off-diagonal blocks of $\Sigma$ in exchange for a much more reliable weight construction in our point estimation; the uncertainty caused by the unspecified off-diagonal blocks is accounted for by a pseudo bootstrap procedure.

b. Estimating the scaling factor: Point estimator

If $X_t$ were known, $\beta$ could easily be estimated using the EE method; see appendix A for a tutorial. The basic principle of the EE method is to construct a set of equations, called EEs, based on the sample data and the unknown parameters. The estimator is the solution to the EEs. Consider a simpler situation where the $\Sigma_t$ are known, and let us discard the temporal dependence for now. An EE can be obtained from weighted least squares:
$$\frac{1}{T} \sum_{t=1}^{T} X_t^{\mathrm{T}} \Sigma_t^{-1} (Y_t - X_t \beta) = 0. \tag{3}$$
Solving this equation for $\beta$ gives a closed-form estimator of $\beta$. As the expectation of the left side of Eq. (3) is zero, it is an unbiased EE, and the resulting estimator is consistent according to EE theory (e.g., Godambe 1991; Heyde 2008). That is, as the sample size $T$ increases, the estimator converges in probability to the true parameter value of $\beta$.
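For concreteness, solving Eq. (3) yields the familiar weighted least squares form
$$\hat{\beta} = \left( \sum_{t=1}^{T} X_t^{\mathrm{T}} \Sigma_t^{-1} X_t \right)^{-1} \sum_{t=1}^{T} X_t^{\mathrm{T}} \Sigma_t^{-1} Y_t,$$
which is the estimator that the bias correction below modifies once $X_t$ is replaced by $\tilde{X}_t$.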
Now, only $\tilde{X}_t$ instead of $X_t$ is known. The left side of Eq. (3), with $X_t$ substituted by $\tilde{X}_t$, no longer has an expectation of zero, resulting in a biased estimating equation. With assumptions 1–3, it can be shown that (appendix B)
$$E\{\tilde{X}_t^{\mathrm{T}} \Sigma_t^{-1} (Y_t - \tilde{X}_t \beta)\} = -aS \, \mathrm{diag}(1/m_1, \ldots, 1/m_J)\, \beta. \tag{4}$$
Thus, a bias-corrected estimating equation for $\beta$ is
$$\frac{1}{T} \sum_{t=1}^{T} G_t(\beta; \Sigma_t) = 0, \tag{5}$$
where
$$G_t(\beta; \Sigma_t) = \tilde{X}_t^{\mathrm{T}} \Sigma_t^{-1} (Y_t - \tilde{X}_t \beta) + aS \, \mathrm{diag}(1/m_1, \ldots, 1/m_J)\, \beta.$$
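To see where the correction in Eq. (5) comes from, write $\nu_t$ for the $S \times J$ matrix of measurement errors at time $t$; under assumptions 1 and 2, a short derivation is
$$E\{\tilde{X}_t^{\mathrm{T}} \Sigma_t^{-1} (Y_t - \tilde{X}_t \beta)\} = E\{(X_t + \nu_t)^{\mathrm{T}} \Sigma_t^{-1} (\epsilon_t - \nu_t \beta)\} = -E(\nu_t^{\mathrm{T}} \Sigma_t^{-1} \nu_t)\, \beta,$$
where the $j$th diagonal element of $E(\nu_t^{\mathrm{T}} \Sigma_t^{-1} \nu_t)$ is $\mathrm{tr}(\Sigma_t^{-1} \Omega_{tj}) = \mathrm{tr}(\Sigma_t^{-1} a \Sigma_t / m_j) = aS/m_j$, and the off-diagonal elements vanish by the mutual independence in assumption 1. Adding the term $aS\,\mathrm{diag}(1/m_1, \ldots, 1/m_J)\beta$ therefore restores an unbiased EE.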
Equation (5) is not implementable, because $\Sigma_t$ is unknown. Under assumption 2, $\Psi_t = a\Sigma_t$, where $\Psi_t$ are the diagonal blocks of $\Psi$. So we can estimate the pattern of $\Sigma_t$ using the control simulations $\{\epsilon^{(1)}, \ldots, \epsilon^{(L)}\}$. Under assumption 3, the $\Sigma_t$ are identical for all $t \in \{1, \ldots, T\}$. Therefore, the $L$ replicates at all $T$ time periods can be pooled to form a sample of size $LT$ for estimating this $S \times S$ covariance matrix. This is in contrast to typical OF analyses with TLS, where a sample of size $L$ is used to estimate $\Sigma$ of dimension $TS \times TS$. Let $\hat{\Sigma}_+$ and $\hat{\Psi}_+$ be the pooled estimates of $\Sigma_t$ and $\Psi_t$, respectively. Substituting $\hat{\Sigma}_+ = a^{-1}\hat{\Psi}_+$ for $\Sigma_t$ in Eq. (5), an implementable EE for $\beta$ is
$$\frac{1}{T} \sum_{t=1}^{T} G_t(\beta; \hat{\Sigma}_+) = \frac{1}{T} \sum_{t=1}^{T} G_t(\beta; a^{-1}\hat{\Psi}_+) = 0. \tag{6}$$
Solving the EE [Eq. (6)] gives the closed-form estimator
$$\hat{\beta}_T = \frac{1}{T} A_T \sum_{t=1}^{T} \tilde{X}_t^{\mathrm{T}} \hat{\Psi}_+^{-1} Y_t, \tag{7}$$
where
$$A_T = \left\{ \frac{1}{T} \sum_{t=1}^{T} \tilde{X}_t^{\mathrm{T}} \hat{\Psi}_+^{-1} \tilde{X}_t - S \, \mathrm{diag}(1/m_1, \ldots, 1/m_J) \right\}^{-1}.$$
Interestingly, the unknown scale a cancels out, and it is not needed for the estimator [Eq. (7)].
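As an illustration, the following minimal R sketch computes the pooled weight matrix and the point estimator in Eq. (7). The array names (Y, Xtilde, ctl, m) are hypothetical placeholders, not our software implementation.

```r
## Point estimator of Eq. (7), assuming hypothetical inputs:
##   Y:      T x S matrix of observations
##   Xtilde: T x S x J array of ensemble-average signals
##   ctl:    L x T x S array of control runs
##   m:      length-J vector of ensemble sizes
ee_beta <- function(Y, Xtilde, ctl, m) {
  T <- dim(Y)[1]; S <- dim(Y)[2]; J <- dim(Xtilde)[3]; L <- dim(ctl)[1]
  ## Pool the L x T control fields into one LT x S sample (assumption 3)
  ## to estimate the S x S matrix Psi_+ = a * Sigma_+.
  pooled <- matrix(ctl, nrow = L * T, ncol = S)
  W <- solve(cov(pooled))               # weight matrix Psi_+^{-1}
  XWX <- matrix(0, J, J); XWY <- matrix(0, J, 1)
  for (t in 1:T) {
    Xt <- matrix(Xtilde[t, , ], nrow = S, ncol = J)
    XWX <- XWX + t(Xt) %*% W %*% Xt
    XWY <- XWY + t(Xt) %*% W %*% Y[t, ]
  }
  ## A_T^{-1} includes the bias correction -S * diag(1/m); see Eq. (7).
  A_inv <- XWX / T - S * diag(1 / m, nrow = J)
  drop(solve(A_inv, XWY / T))
}
```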

Recall the EE [Eq. (6)] discards the temporal dependence, as it only uses the diagonal blocks of Σ, which means we are losing efficiency. Estimating all T blocks of the Toeplitz structure of Σ is possible, but blocks further away from the main diagonals are estimated less reliably, because the number of available pairs drops. The fully estimated Σ may still not be of full rank, a challenge similarly encountered in typical OF analyses. Discarding the off-diagonal blocks in estimation has the potential to lose efficiency, but the diagonal blocks are estimated more reliably, which leads to a more reliable weight. The gain from a more reliable weight may well exceed the loss from discarding the temporal dependence, especially when the temporal dependence is not strong, as shown in our numerical studies. Indeed, in typical OF applications where interannual variability, such as the effect of ENSO, is averaged out, the temporal dependence is weak, and the efficiency loss can be minimal.

The estimator $\hat{\beta}_T$ in Eq. (7) has nice properties. For a large $T$, it is approximately unbiased (consistent) as long as the expectation in Eq. (4) holds; no distributional assumptions are needed beyond the expectation. The efficiency loss from discarding the temporal dependence in the point estimation is offset by the potentially big gain from the more reliable pooled estimator of the $S \times S$ weight matrix $\hat{\Sigma}_+^{-1}$ (equivalently, $\hat{\Psi}_+^{-1}$). Incorporating the temporal dependence, as the prevailing TLS method does through a much bigger $TS \times TS$ weight matrix, has the potential to achieve higher efficiency in point estimation, but this potential cannot be realized because of the large uncertainty in estimating a much higher-dimensional weight matrix. Further, missing data do not affect the validity of the proposed estimator in practice. We simply use the available observations to construct the contribution of each cluster (time point) to the EEs, with the rows and columns corresponding to the missing observations removed from $\hat{\Sigma}_+$.

Although $a$ is not needed in the point estimation of $\beta$, it is needed in constructing the confidence intervals of $\beta$, and an estimator of $a$ provides a diagnosis of the restriction $a = 1$ in typical OF analyses. We give details of how to obtain an estimator $\hat{a}_T$ of $a$ with an EE in appendix C.

c. Confidence intervals

Confidence intervals for $\beta$ can be constructed based on the normal approximation of the estimator $\hat{\beta}_T$. By the theory of EEs, as $T \to \infty$,
$$\sqrt{T}(\hat{\beta}_T - \beta) \to N(0, ABA^{\mathrm{T}}), \tag{8}$$
where $A = \lim_{T\to\infty} A_T$, $B = \lim_{T\to\infty} B_T$, and $B_T = \mathrm{cov}\{T^{-1/2} \sum_{t=1}^{T} G_t(\beta; \hat{\Sigma}_+)\}$. A detailed derivation can be found in appendix D. The two components $A$ and $B$ in the variance can be consistently estimated by their sample counterparts, with $\beta$ replaced by $\hat{\beta}_T$. In particular, $A$ can be estimated easily by $A_T$, but estimating $B$ is more challenging, because there may be unspecified temporal dependence, which, if ignored, would lead to confidence intervals with undercoverage issues.

We propose a pseudo residual bootstrap approach that fully utilizes the control runs in estimating $B$. Define the block residual $r_t(\beta) = Y_t - \tilde{X}_t \beta$, $t = 1, \ldots, T$, and let $r(\beta) = [r_1^{\mathrm{T}}(\beta), \ldots, r_T^{\mathrm{T}}(\beta)]^{\mathrm{T}}$. Note that $r(\beta)$ has expectation zero and covariance matrix $\delta^2(\beta)\, \Sigma$ with $\delta(\beta) = (1 + a \sum_{j=1}^{J} \beta_j^2 / m_j)^{1/2}$. In a standard residual block bootstrap procedure, one would resample the residuals $r(\hat{\beta}_T)$ in blocks to preserve the spatial and temporal dependences and add the bootstrap copies of the residuals to $\tilde{X}_t \hat{\beta}_T$ to form bootstrap copies of $Y_t$, $t = 1, \ldots, T$. The performance of the standard procedure here, however, is questionable, because it requires a large sample size $T$, whereas in real-world applications where multiyear averages are considered, $T$ can be quite small. A valid bootstrap procedure has to preserve the spatial and temporal dependences in the data. We resort to the control runs to meet this requirement, because they are assumed to have a covariance structure similar to that of the climate system (assumption 2).

The control runs need to be appropriately scaled to form bootstrap residuals with the spatial and temporal dependences preserved. Recall that $\hat{\Psi} = \hat{a}_T \hat{\Sigma}$. With the control runs $\{\epsilon^{(1)}, \ldots, \epsilon^{(L)}\}$, we use
$$\left\{ \frac{\delta(\hat{\beta}_T)\, \epsilon^{(1)}}{\sqrt{\hat{a}_T}}, \ldots, \frac{\delta(\hat{\beta}_T)\, \epsilon^{(L)}}{\sqrt{\hat{a}_T}} \right\}$$
in place of bootstrapped residuals, which by construction preserve the spatial and temporal dependences perfectly. This leads to $L$ pseudo bootstrap samples. Each pseudo bootstrap sample gives a copy of $T^{-1/2} \sum_{t=1}^{T} G_t(\beta; \hat{\Sigma}_+)$. We estimate $B$ by the sample covariance matrix $\hat{B}_T$ of these $L$ copies. Note that the scale $\hat{a}_T$ plays an important role in determining the residual magnitudes and, hence, the width of the confidence intervals. This procedure is computationally efficient, as it only requires evaluating the EE [Eq. (6)] with the $L$ copies from the control runs, with neither resampling nor resolving the equation. The procedure is general and could be applied to estimate the variance of other estimators, including the TLS estimator.
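A minimal R sketch of this variance estimator follows, using the same hypothetical arrays as in the point estimation sketch above, together with the estimates beta_hat and a_hat and the weight matrix W = $\hat{\Psi}_+^{-1}$. The additive bias-correction term of $G_t$ is constant across the pseudo samples, so it drops out of the sample covariance and is omitted.

```r
## Pseudo residual bootstrap estimate of B using scaled control runs.
ee_boot_B <- function(Xtilde, ctl, beta_hat, a_hat, W, m) {
  L <- dim(ctl)[1]; T <- dim(ctl)[2]; J <- length(beta_hat)
  delta <- sqrt(1 + a_hat * sum(beta_hat^2 / m))  # delta(beta_hat)
  G <- matrix(0, L, J)
  for (l in 1:L) {
    for (t in 1:T) {
      Xt  <- matrix(Xtilde[t, , ], ncol = J)
      eps <- delta * ctl[l, t, ] / sqrt(a_hat)    # scaled pseudo residual
      G[l, ] <- G[l, ] + drop(t(Xt) %*% W %*% eps)
    }
    G[l, ] <- G[l, ] / sqrt(T)   # one copy of T^{-1/2} sum_t G_t
  }
  cov(G)                         # sample covariance of the L copies
}
```

The variance of $\hat{\beta}_T$ is then estimated by $A_T \hat{B}_T A_T^{\mathrm{T}}/T$, which feeds directly into the confidence intervals below.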

Note that one can also incorporate the runs for the J external forcings, after centering, through the aforementioned scaling procedure to increase the number of bootstrapped copies. In our simulations and data analysis, we used L + m1 + m2 − 2 runs in total, where m1 and m2 are the numbers of runs for the ANT and NAT forcings, respectively. The subtraction of 2 is because both the ANT and NAT forcings were centered in the analysis.

With $A_T \hat{B}_T A_T^{\mathrm{T}}$ as an estimator of $ABA^{\mathrm{T}}$, we are ready to construct confidence intervals for $\beta$ based on the normal approximation of $\hat{\beta}_T$. For each $\beta_j$, a $100(1 - \alpha)\%$ confidence interval is
$$\left( \hat{\beta}_{T,j} - \frac{z_{\alpha/2}\, \hat{\sigma}_{\hat{\beta}_{T,j}}}{\sqrt{T}},\ \hat{\beta}_{T,j} + \frac{z_{\alpha/2}\, \hat{\sigma}_{\hat{\beta}_{T,j}}}{\sqrt{T}} \right), \quad j = 1, \ldots, J,$$
where $\hat{\beta}_{T,j}$ is the $j$th component of $\hat{\beta}_T$, $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution, and $\hat{\sigma}_{\hat{\beta}_{T,j}}^2$ is the $j$th diagonal element of $A_T \hat{B}_T A_T^{\mathrm{T}}$. As shown in the simulation study in the next section, the coverage rates of the confidence intervals constructed this way are close to the nominal level.
To assess the quality of a confidence interval, we use the interval score of Gneiting and Raftery (2007) as a measure that combines the interval length and its deviation from the target. Consider a symmetric $100(1 - \alpha)\%$ confidence interval $(L, U)$ for a target $\tau$. The interval score at level $\alpha$ is
$$\mathrm{IS}_\alpha = (U - L) + \frac{2}{\alpha}(L - \tau)\, \mathbf{1}(L > \tau) + \frac{2}{\alpha}(\tau - U)\, \mathbf{1}(U < \tau),$$
where $\mathbf{1}(\cdot)$ is the indicator function. The optimal score is achieved when the target $\tau$ is covered by $(L, U)$ and the interval length is minimal. This score is used to compare different confidence intervals in the simulation study.
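The interval score is simple to compute; a small R helper (with illustrative names) is

```r
## Interval score of Gneiting and Raftery (2007) for a symmetric
## 100(1 - alpha)% interval (L, U) and target tau.
interval_score <- function(L, U, tau, alpha) {
  (U - L) + (2 / alpha) * (L - tau) * (L > tau) +
    (2 / alpha) * (tau - U) * (U < tau)
}
interval_score(0.8, 1.3, 1, 0.10)  # target covered: score equals the width 0.5
```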

d. Diagnostics of the statistical model

The first diagnosis is to test the null hypothesis $H_{01}$: $a = 1$, which is assumed in typical OF analyses. The asymptotic normal distribution of $\hat{a}_T$ can be used to construct a statistic $Z = (\hat{a}_T - 1)/\widehat{\mathrm{var}}^{1/2}(\hat{a}_T)$, which asymptotically follows the standard normal distribution under the null hypothesis. The variance of $\hat{a}_T$ can be estimated using the same pseudo bootstrap procedure as for $\hat{\beta}_T$.

Alternatively, $H_{01}$ can be tested without estimating $a$. Consider the prewhitened residuals $r_t^* = a^{1/2} \delta^{-1}(\hat{\beta}_T) \hat{\Psi}_+^{-1/2} r_t$, $t = 1, \ldots, T$, fixing $a = 1$. Define $r^* = (r_1^{*\mathrm{T}}, \ldots, r_T^{*\mathrm{T}})^{\mathrm{T}}$. Under $H_{01}$, $r^*$ should have variance 1, although there may be some temporal dependence. So we consider the hypothesis $H_{02}$ that the prewhitened residual $r^*$ has variance 1 and use the sample variance $S^2(r^*)$ of $r^*$ as the test statistic for $H_{02}$. The null distribution of $S^2(r^*)$ depends on the unspecified temporal dependence. Denote the prewhitened control runs by $\epsilon_t^{(l)*} = \hat{\Psi}_+^{-1/2} \epsilon_t^{(l)}$ and $\epsilon^{(l)*} = [\epsilon_1^{(l)*\mathrm{T}}, \ldots, \epsilon_T^{(l)*\mathrm{T}}]^{\mathrm{T}}$. The variance of $\epsilon^{(l)*}$, $l = 1, \ldots, L$, should also be 1, and its temporal dependence should be the same as that of $r^*$. Therefore, the null distribution of $S^2(r^*)$ can be approximated by the empirical distribution $\hat{F}(\cdot)$ of the sample variances $S^2(\epsilon^{(l)*})$, $l = 1, \ldots, L$. When $L$ is small, to increase the accuracy, we could also generate block bootstrapped versions of the prewhitened control runs and pool them with the original prewhitened control runs to approximate the null distribution of the test statistic. The approximate $p$ value is $2\min\{\hat{F}[S^2(r^*)], 1 - \hat{F}[S^2(r^*)]\}$. The same procedure can be applied with $a$ evaluated at $\hat{a}_T$ as a diagnostic check of $H_{02}$ after allowing $a \neq 1$.
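A minimal R sketch of the $H_{02}$ diagnostic, assuming hypothetical inputs rstar (the vector of prewhitened residuals) and ctl_star (an L x TS matrix of prewhitened control runs), is

```r
## Two-sided p value for H02 from the empirical null distribution of the
## sample variance of the prewhitened control runs.
h02_pvalue <- function(rstar, ctl_star) {
  s2_obs  <- var(as.vector(rstar))        # S^2(r*)
  s2_null <- apply(ctl_star, 1, var)      # S^2 of each prewhitened control run
  Fhat <- ecdf(s2_null)
  2 * min(Fhat(s2_obs), 1 - Fhat(s2_obs))
}
```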

The effect of prewhitening with $\hat{\Psi}_+$ also needs to be tested. The EE method is based on the assumption that the regression error is temporally stationary and that, after prewhitening, the errors at each time point become spatially uncorrelated. So we test the composite null hypothesis $H_{03}$: there is no spatial autocorrelation in the prewhitened residuals $r_t^*$ for all $t = 1, \ldots, T$. For each time $t$, we use Moran's I statistic to test for zero spatial correlation at that $t$ (Moran 1950; Li et al. 2007). The $p$ value $p_t$ at each time $t$ can be calculated by the function Moran.I() from the R package ape (Paradis et al. 2004). To combine the $T$ individual $p$ values into an overall diagnosis of the prewhitening effect, we use the recently developed Cauchy combination rule to define a combined statistic (Liu and Xie 2020)
$$C_T = \frac{1}{T} \sum_{t=1}^{T} \tan[(0.5 - p_t)\pi].$$
The tail of the null distribution of $C_T$ is well approximated by the standard Cauchy distribution under arbitrary dependence structures among the individual $p$ values. The Cauchy combination test thus suits our situation well for obtaining a single decision about the adequacy of the prewhitening with $\hat{\Psi}_+$. Of note, the prewhitening is needed only for model diagnostics, not for point or interval estimation.
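A sketch of the combined diagnostic in R, assuming a hypothetical T x S matrix res of prewhitened residuals and an S x S spatial weight matrix wmat (e.g., inverse-distance weights), is

```r
library(ape)  # provides Moran.I()

## Combine the T Moran's I p values with the Cauchy combination rule.
cauchy_prewhiten_test <- function(res, wmat) {
  pvals <- apply(res, 1, function(r) Moran.I(r, wmat)$p.value)
  CT <- mean(tan((0.5 - pvals) * pi))
  ## The tail of the null distribution of C_T is approximately standard Cauchy.
  1 - pcauchy(CT)
}
```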

3. Simulation study

a. Simulation settings

To evaluate the performance of the proposed estimator in realistic settings, we conducted simulation studies mimicking detection and attribution analyses of changes in the mean temperature. Both global and regional scales were considered. The global setting was based on S = 54 grid boxes of size 40° × 30°. The regional setting was based on eastern North America (ENA), with S = 21 grid boxes of size 5° × 5°. For ease of comparison with the results in Li et al. (2021), 5-yr mean temperatures were considered over the time period 1951–2010. Mimicking the application in the next section, the observed data and simulations under the external forcings were anomalies relative to their 30-yr average over 1961–90. Due to this centering, the period of 1961–65 was then removed from the analysis, resulting in T = 11 clusters.

For each setting, we first set the true signals X and the true covariance Σ of the TS-dimensional regression error ϵ. Two signals were considered, X1 for the ANT forcing and X2 for the NAT forcing. Their true values were set to be the average of 35 and 46 simulations under the ANT and NAT forcings, respectively, from phase 5 of the Coupled Model Intercomparison Project (CMIP5). The true Σ was set to be the estimate based on 223 control runs from CMIP5 with a block Toeplitz structure imposed to satisfy the temporal stationarity assumption. See appendix E for details on strategies to make it positive definite and make the temporal correlation decay as time lag increases.

With $X_1$, $X_2$, and $\Sigma$ set, we can now generate the observed data and control runs. For $j \in \{1, 2\}$, the estimated signal $\tilde{X}_j = (\tilde{X}_{j1}^{\mathrm{T}}, \ldots, \tilde{X}_{jT}^{\mathrm{T}})^{\mathrm{T}}$ was generated from a multivariate normal distribution $N(X_j, a\Sigma/m_j)$, with $m_1 = m_2 = m \in \{20, 40\}$ and $a \in \{0.5, 1\}$. The regression error $\epsilon$ and the control runs $\{\epsilon^{(1)}, \ldots, \epsilon^{(L)}\}$ were independently generated from a multivariate normal distribution $N(0, \Sigma)$, with $L \in \{50, 100, 200\}$. Here $L = 50$ is relatively common in OF studies, and $L = 200$ is possible but not easily obtained unless runs from different climate models are pooled. The observed temperature $Y = (Y_1^{\mathrm{T}}, \ldots, Y_T^{\mathrm{T}})^{\mathrm{T}}$ was generated from the model [Eq. (1)] with $\beta = (\beta_1, \beta_2)^{\mathrm{T}} = (1, 1)^{\mathrm{T}}$.
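One replicate of this design can be generated along the following lines in R; the inputs (true signals X1, X2 as TS-vectors and the true covariance Sigma) are placeholders, and the control runs are drawn with variance a*Sigma per assumption 2 (the main setting uses a = 1, for which this reduces to N(0, Sigma)).

```r
library(MASS)  # provides mvrnorm()

gen_replicate <- function(X1, X2, Sigma, m = 20, a = 1, L = 50,
                          beta = c(1, 1)) {
  TS <- length(X1)
  X1t <- X1 + mvrnorm(1, rep(0, TS), a * Sigma / m)  # noisy ANT signal
  X2t <- X2 + mvrnorm(1, rep(0, TS), a * Sigma / m)  # noisy NAT signal
  Y   <- beta[1] * X1 + beta[2] * X2 + mvrnorm(1, rep(0, TS), Sigma)
  ctl <- mvrnorm(L, rep(0, TS), a * Sigma)           # L control runs
  list(Y = Y, X1t = X1t, X2t = X2t, ctl = ctl)
}
```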

For each combination of m and L, we ran 1000 replicates to evaluate the proposed method compared to two TLS-type methods. The first TLS method, denoted by TLS-TS, is the prevailing two-sample approach where the control runs were split into two samples, each giving an estimate of Σ. The first one was used to prewhiten the data in the point estimation of β; the second one was used to estimate the variance of the estimator and construct confidence intervals based on the normal approximation (DelSole et al. 2019; Li et al. 2021). Unlike the simulation-based approach of Allen and Stott (2003), no open intervals were expected. The second TLS method, denoted by TLS-PBC, uses all the control runs to estimate Σ and calibrates the confidence intervals by parametric bootstrap (Li et al. 2021) with 500 bootstrap replicates. The whole study was run on the high-performance cluster (HPC) of the University of Connecticut.

b. Results

Here we only discuss the simulation results from the setting where the true $\Sigma$ is the block Toeplitz matrix, the true $a$ is 1, and the regression error follows a multivariate normal distribution. More simulation results are provided in the supplement. Table 1 summarizes the bias and the RMSE of the three estimators in all the simulation configurations. All three point estimators appear unbiased in all settings. At the global level, the EE estimator has the smallest RMSE among the three estimators, while the TLS-TS estimator has the largest RMSE in most settings. The difference in RMSE between EE and TLS-PBC increases as the sample size of control runs, and hence the accuracy in estimating $\Sigma$, decreases. For $L = 50$, EE has a much smaller RMSE for both scaling factors than TLS-PBC. The RMSE of EE remains almost the same for different $L$, which means a small sample of control runs is sufficient for the EE method. In contrast, TLS-PBC needs more control runs to achieve a smaller RMSE. For the ANT forcing, the RMSE of EE is always smaller than that of TLS-PBC for all $m$ and $L$. For the NAT forcing, EE is better than TLS-PBC when $L = 50$ or 100. In the extreme case where $L = 200$ and $m = 20$, TLS-PBC is slightly better than EE, but such a large $L$ is not easy to obtain. At the regional scale, where the signal-to-noise ratio is lower, the RMSE of EE is very close to or smaller than those of TLS-TS and TLS-PBC, except for the NAT forcing when $m = 20$. In summary, all three methods give unbiased estimators, but EE has the smallest RMSE in most settings, especially for the ANT scaling factor.

Table 1.

Summaries of the bias and RMSE from three methods in the simulation settings.


Table 2 summarizes the average widths, the empirical coverage rates, and the average interval score (Gneiting and Raftery 2007) of the 90% confidence intervals constructed from the three methods across all the simulation settings. At both global and regional levels, EE intervals have coverage rates very close to the nominal level in almost all settings. For all three levels of L, even at L = 50, their coverage rates are close to the nominal level 90%, with a minimum of 87% for both ANT and NAT scaling factors, regardless of the noise level of the measurement error controlled by m. TLS-TS intervals have the lowest coverage, as reported in Li et al. (2021). Their coverage rate improves as L increases but remains unsatisfying and lower than 80% for both ANT and NAT even when L = 200. TLS-PBC requires L = 100 to reach the nominal level of coverage rate for the ANT scaling factor and L = 200 for the NAT scaling factor, whereas it becomes conservative for the ANT scaling factor when L = 200, especially at the regional level. In addition, EE intervals are often narrower than or comparable to TLS-PBC intervals when both methods provide desired coverage rates. The superiority of EE intervals regarding the widths and coverage rates is reflected by the lower interval scores. The undercoverage of TLS-TS intervals and the conservativeness of TLS-PBC intervals lead to larger interval scores. At the regional level, where the signal-to-noise ratio is lower, all intervals become wider than at the global level. EE intervals still give the proper coverage rate with a sample size of control runs as small as L = 50. To sum up, the proposed EE intervals provide valid confidence intervals with the desired level of coverage rate and a narrower width in most cases at a much lower computational cost than TLS-PBC intervals.

Table 2.

Summaries of the average length, empirical coverage percentages (CPs), and average interval score of the 90% confidence intervals constructed from three methods in the simulation settings.


c. Additional simulations

To investigate the effect of differences in the magnitude of climate variability between observations and model simulations and the effect of non-Gaussian residuals, we also conducted additional simulations with $a \in \{0.5, 1\}$, $\epsilon$ following a multivariate normal or $t$ distribution with 15 degrees of freedom, and the true $\Sigma$ set as a block Toeplitz matrix or the regularized linear shrinkage estimator from control runs (see section S2 in the online supplemental material). For the point estimators, the EE method is still nearly unbiased and has the smallest RMSE across all the settings, while the two TLS methods show a large bias when $a = 0.5$, especially for the NAT forcing. For the confidence intervals, the EE method maintains a close-to-nominal coverage rate, while TLS-TS has a much lower coverage rate. TLS-PBC also could not reach the nominal coverage rate when the sample size of control runs is small, e.g., $L = 50$. These results show the robustness of the EE method against non-Gaussian error distributions and the necessity of relaxing the well-accepted assumption $a = 1$. The advantage of the EE method would be even more obvious if the tails of the multivariate $t$ distribution were heavier, with smaller degrees of freedom.

We also investigated the performance of the EE method with a larger number of sites (S = 108), with ϵ following a multivariate normal or t distribution (see section S2). Results suggest that the EE method maintains its unbiasedness, efficiency, and validity in almost all cases. Even though a minor bias is observed for the NAT forcing when the measurement error is large (m = 20) and the number of control runs is small (L = 50), it becomes negligible with the increase of L. For TLS-TS and TLS-PBC, the NAT forcing is also a challenging case, especially when ϵ follows a multivariate t distribution; the bias and RMSE are significantly increased compared to their performances with S = 54. In general, the EE method retains its advantages with a large S.

In typical OF analyses, the ANT forcing is often obtained by subtracting the NAT forcing from the ALL forcing, which induces correlated measurement errors between ANT and NAT. Thus, to apply the EE method, we suggest estimating $\beta_{\mathrm{ALL}}$ and $\beta_{\mathrm{NAT}}$ first and then constructing the estimators for $\beta_{\mathrm{ANT}}$ and $\beta_{\mathrm{NAT}}$ (see details in section 4a). Although this estimating procedure differs slightly from that in the main simulations presented in section 3b, which assumed the two forcings to be independent, we conducted additional simulations in section S2 mimicking the data generation and processing procedure of section 4a. The results suggest that the EE method is valid as expected and maintains its advantages over the TLS methods.

As our EE method assumes temporal stationarity (assumption 3), we further investigated whether the same assumption would also improve the prevailing TLS estimator. The results in section S2 show that the temporal stationarity assumption does reduce the RMSE of the point estimator compared to TLS-TS and TLS-PBC at the regional scale with Gaussian errors; however, the RMSE is still larger than that of the EE estimator in most settings. Also, imposing the temporal stationarity assumption does not seem to reduce the RMSE at the global scale, possibly owing to the higher spatial dimension. When the regression errors are heavy tailed (i.e., $t$ distributed), the adapted TLS estimator is also less efficient than the EE. For confidence intervals, we assessed the performance of TLS adapted with the proposed pseudo bootstrap procedure, as the prevailing TLS could not provide confidence intervals with the desired coverage rates. The results in section S2 suggest that the modified TLS method has a close-to-nominal coverage rate with Gaussian errors, but it can be very conservative when the error distribution is non-Gaussian (e.g., $t$ distributed), leading to a surprisingly large interval width. Therefore, the stationarity assumption does improve the TLS-type methods overall, but our proposed EE method retains clear advantages.

4. Application

To demonstrate the performance of the proposed approach in real-world applications, we conducted optimal fingerprinting analyses on the annual mean near-surface air temperature at the global (GL), continental, and subcontinental scales during 1951–2020 (Zhang et al. 2006). At the continental (and larger) scale, we consider the Northern Hemisphere (NH), NH midlatitude between 30° and 70° (NHM), Eurasia (EA), and North America (NA). At the subcontinental scale, we consider western North America (WNA), central North America (CNA), ENA, southern Canada (SCA), and southern Europe (SEU) (Giorgi and Francisco 2000).

a. Data preparation

Our observational data Y came from the HadCRUT4 dataset (Morice et al. 2012), which contains monthly anomalies of near-surface air temperature on 5° × 5° grid boxes relative to the 30-yr average over 1961–90. The annual mean temperature anomalies were computed from the monthly values. The annual value was considered missing if there were more than 3 monthly values missing in the year. Our analyses were conducted on nonoverlapping 5-yr mean temperature anomalies. If there were at least three annual values available within a 5-yr period, we computed the 5-yr average temperatures. After removal of the 1961–65 period due to centering, we have T = 13 values of 5-yr averages at each grid box. The spatial dimension is also reduced for the global and continental scale analyses by averaging available 5-yr 5° × 5° grid boxes within a larger box. The final box sizes used in the global and continental scale analyses are as follows: GL and NH, 40° × 30°; NHM, 40° × 10°; EA, 10° × 20°; and NA, 10° × 5°. For the subcontinental scale analyses, we only aggregated the boxes of SCA to 10° × 10° boxes. Details about the boundaries of the regions and data availability are summarized in appendix F.

We obtained the estimated signals and control runs from large ensemble simulations conducted with the CanESM5 climate model (Swart et al. 2019). Specifically, let $\tilde{X}_{\mathrm{NAT}}$ be the estimated NAT signal from 50 CanESM5 runs. Since the ALL forcing in CanESM5 ended in 2014, we extended the simulations with those under the ssp245 forcing (Danabasoglu 2019). The estimated ALL signal $\tilde{X}_{\mathrm{ALL}}$ was obtained from 50 such extended runs. During the processing, the missing pattern of $Y$ was imposed, and the same averaging and aggregation procedures as for $Y$ were applied to obtain $\tilde{X}_{\mathrm{ALL}}$ and $\tilde{X}_{\mathrm{NAT}}$. Since they were centered by the 30-yr average over 1961–90, the first 5-yr period 1961–65 was excluded as well. The control runs were obtained from 50 runs under solar and volcanic forcings only, 30 runs under anthropogenic aerosols only, and 50 runs under greenhouse gases only from the CanESM5 simulations; after centering within each forcing, the intraensemble variation gave 127 runs. At each grid box, a long-term linear trend was removed from the control simulations of each climate model separately.

The scaling factor of the ANT forcing is of the most interest. In typical OF analyses, one would obtain $\tilde{X}_{\mathrm{ANT}}$ by subtracting $\tilde{X}_{\mathrm{NAT}}$ from $\tilde{X}_{\mathrm{ALL}}$ and estimate the scaling factors of ANT and NAT directly. Such a transformation leads to correlated measurement errors $\nu_1$ and $\nu_2$, which violates assumption 1. Thus, we first fit the model and obtain the estimated scaling factors $\hat{\beta}_{\mathrm{ALL}}$ and $\hat{\beta}_{\mathrm{NAT}}$; then the estimated scaling factor for the ANT forcing is $\hat{\beta}_{\mathrm{ANT}} = \hat{\beta}_{\mathrm{ALL}}$, and the estimated scaling factor for the NAT forcing is $\hat{\beta}_{\mathrm{ALL}} + \hat{\beta}_{\mathrm{NAT}}$. The variances and confidence intervals can be obtained from the corresponding linear transformation of $\hat{\beta}_{\mathrm{ALL}}$ and $\hat{\beta}_{\mathrm{NAT}}$.
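The transformation is linear, so the point estimates and their covariance carry over directly; a small R sketch (with illustrative names est and V for the fitted vector $(\hat{\beta}_{\mathrm{ALL}}, \hat{\beta}_{\mathrm{NAT}})^{\mathrm{T}}$ and its estimated 2 x 2 covariance matrix) is

```r
## Map (beta_ALL, beta_NAT) and its covariance to the ANT and NAT scales.
transform_ant_nat <- function(est, V) {
  Tm <- rbind(ANT = c(1, 0),   # beta_ANT = beta_ALL
              NAT = c(1, 1))   # beta_NAT = beta_ALL + beta_NAT
  list(est = drop(Tm %*% est), V = Tm %*% V %*% t(Tm))
}
```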

b. Estimation results

Figure 1 shows the estimated ANT and NAT scaling factors with 90% confidence intervals based on TLS-TS, TLS-PBC, and the proposed EE method. For both forcings, the point estimates of the three methods differ, especially at the subcontinental scale. Given the unbiasedness and robustness of the EE method demonstrated in the simulation studies, its results are more trustworthy. It is also worth noting that the point estimates for the ANT forcing are around 0.5 across almost all regions, indicating that the simulated ANT signal in the CanESM5 model is about twice as large in magnitude as that implied by the observations. This finding is consistent with existing literature indicating that CanESM5 warms too fast (Gillett et al. 2021, their Fig. 2). Regarding the confidence intervals, the EE method provides narrower intervals than TLS-PBC on most scales, which is consistent with the conservativeness of the TLS-PBC intervals observed in the simulation studies and discussed in Li et al. (2021). The often-narrower intervals of TLS-TS are questionable owing to the undercoverage issue shown in the simulation studies.

Fig. 1.

Estimated scaling factors with 90% confidence intervals of the ANT and NAT forcings at the global, continental, and subcontinental scales during 1951–2020. The confidence intervals for TLS-TS and TLS-PBC in WNA, ENA, and SCA regions are truncated for better visualization.


Across the detection and attribution analyses from the three methods, the results are similar for the ANT forcing and substantially different for the NAT forcing, owing to its weaker signal. For the ANT forcing, TLS-PBC and EE lead to both detection and attribution statements at WNA, while TLS-TS supports only the detection statement. In the ENA region, only TLS-PBC leads to the detection statement, while the other two methods do not, as their confidence intervals cover 0. In the other regions, all three methods support the detection but not the attribution statements for the ANT forcing. For the NAT forcing, the EE method supports detection and attribution in the NH, NHM, and NA regions; TLS-TS supports the detection and attribution statements in the NA, CNA, WNA, and SEU regions; TLS-PBC supports neither detection nor attribution statements in any region. Although we cannot claim which method is more accurate for this single analysis, we believe that a thorough comparison of the performance of the three methods is reflected in the simulation studies.

c. Model diagnostics

Table 3 summarizes the $p$ values of the tests of the three hypotheses described in section 2d. In particular, we consider two scenarios where the prewhitened residuals are calculated based on 1) the estimated $\hat{a}$ or 2) the prefixed $a = 1$. The significance level is set at $\alpha = 0.05$. The estimated $\hat{a}$ is not close to 1 in most regions, except NHM and CNA. Accordingly, for the regions other than NHM and CNA, both $H_{01}$ and $H_{02}$ (assuming $a = 1$) are rejected, while $H_{02}$ based on the estimated $\hat{a}$ is not rejected, as expected. Thus, the assumption $a = 1$ is violated in most regions of this dataset. Table 3 also supports the conclusion that there is no spatial autocorrelation in $r^*(\hat{\beta})$ in most regions. Hence, applying the proposed EE method to this dataset is reasonable, as all assumptions hold, while the results of the two TLS methods are questionable owing to the violation of $a = 1$ in most regions.

Table 3.

Estimated a and p values of model diagnostic tests for the EE method.


5. Discussion

Our methodological contributions are threefold. First, we propose an efficient, bias-corrected EE approach to estimate the scaling factors in OF. Under the temporal stationarity assumption about the natural internal variability, $\Sigma$ has a block Toeplitz structure, with each time point as a block, which greatly reduces the number of parameters in $\Sigma$. The diagonal blocks of $\Sigma$, which capture the spatial dependence, can be estimated more reliably than the other blocks of $\Sigma$. Although we discarded the temporal dependence in the EE method, which may lead to some efficiency loss, the gain from the much-reduced uncertainty in estimating the spatial dependence results in a much-improved estimator. The same assumption applied to the TLS method also leads to improvement, but the EE method still has advantages when the spatial dimension is higher or the error distribution is heavy tailed. Unlike approaches that rely on distributional specifications (Hannart 2016; Katzfuss et al. 2017), no distributional assumption beyond the first two moments is needed. Our second contribution is valid confidence intervals for the scaling factors with close-to-nominal coverage rates. The confidence intervals are constructed with a novel pseudo residual bootstrap method that takes advantage of the available control runs, which preserve both the spatial and temporal dependences. This pseudo residual bootstrap method provides close-to-nominal coverage rates at a much lower computational cost, and with no distributional assumption on the regression errors, compared to the TLS-PBC approach of Li et al. (2021). The nonparametric feature of the procedure makes it applicable to other methods too, such as TLS. Finally, our method incorporates a scale parameter $a$ in the variances to account for possible differences in the magnitude of variability between climate model simulations and observations, which provides additional flexibility. We further provide different ways to test the commonly made assumption $a = 1$ as part of the model diagnostics. When the assumption $a = 1$ is violated, our simulation results in the supplemental material suggest that the TLS methods can be heavily biased.

The proposed OF framework is promising as a solid, easy-to-implement alternative to the TLS approach in practice. The undercoverage of the confidence intervals from the TLS approach did not attract attention until recently (DelSole et al. 2019; Li et al. 2021), and no completely satisfying solutions have been available. Our methods give not only point estimates with a smaller RMSE but also confidence intervals with the desired coverage rates. Further, we make weaker variance scale assumptions on climate model simulations under external forcings at much lower computing costs. The methods could be a reasonable solution to the long-overlooked coverage issue of the confidence intervals. When the sample size $L$ of control runs is insufficient, the runs under each external forcing of interest could be centered and appropriately scaled to supplement the control runs in estimating $B$ and, hence, the variance of $\hat{\beta}_T$. When $a \neq 1$, both TLS-TS and TLS-PBC have biased point estimators, while the EE method can estimate $a$ and remains valid in confidence interval coverage rates. In practice, as in the application shown in section 4, the qualitative conclusions of detection and attribution from TLS may be the same as those from EEs. Nevertheless, a reexamination of the main results supporting the attribution assessment of Eyring et al. (2021) using the EE method would be a natural and feasible task with our software implementation.

The EE approach can be extended in several directions. The temporal aggregation, such as the 5-yr average, may be relaxed. One could use annual data with a more sophisticated model for the signals under each forcing, such as B splines, as used in Wang et al. (2021). The use of control runs to estimate internal variability assumes that internal climate variability is not affected by external forcing, which may need careful consideration for regional data. Several works suggest that internal climate variability may change regionally (Bonan et al. 2021; Swart et al. 2015), and estimating it from forced run residuals rather than from control runs could potentially address this issue (Ribes et al. 2013). The climate model differences were discarded in our study. That is, the mj runs under the jth forcing were treated as if they were the same, while in reality, they can be from different climate models. A more realistic model could incorporate a random effect to capture the heterogeneity among different climate models in estimating the signals under each external forcing, which will also allow pooling simulations from different models to reach a larger sample size. Adding random effects into the EE approach merits further investigation. Similarly, the scale parameter a could also be model specific if multiple climate models are considered.

Acknowledgments.

We are grateful for the feedback from participants in the presentation of an earlier version of this work at the International Detection and Attribution Group (IDAG) virtual seminar series. We thank Dr. Timothy DelSole, Dr. Francis Zwiers, and Dr. Dáithí Stone for their constructive suggestions.

Data availability statement.

The data presented in numerical studies are available online as follows: 1) CMIP5, https://pcmdi.llnl.gov/mips/cmip5/; 2) HadCRUT4, https://www.metoffice.gov.uk/hadobs/hadcrut4/; and 3) CanESM5, https://crd-data-donnees-rdc.ec.gc.ca/.

APPENDIX A

Basics of Estimating Equations

Before introducing the general framework, consider the standard linear regression setting. Let {(Y_i, X_i): i = 1, …, n} be a random sample of a response variable Y and a p × 1 covariate vector X. The regression model is
$$E(Y_i \mid X_i) = X_i^T \beta, \tag{A1}$$
where β is a p × 1 coefficient vector to be estimated. The least squares estimator $\hat{\beta}_n$ minimizes the average sum of squares:
$$\hat{\beta}_n = \operatorname*{arg\,min}_{\beta} \frac{1}{n} \sum_{i=1}^{n} (Y_i - X_i^T \beta)^2.$$
The solution is equivalently obtained by equating the first derivative of the objective function with respect to β to zero,
$$\frac{1}{n} \sum_{i=1}^{n} X_i (Y_i - X_i^T \beta) = 0, \tag{A2}$$
and solving for β. Equation (A2), also known as the normal equation of the least squares problem, is an EE, and its left-hand side is called an estimating function. The estimating function has expectation zero under the linear regression model [Eq. (A1)], which specifies only a moment (mean) condition. If the regression error $\epsilon_i = Y_i - X_i^T \beta$ is further assumed to be normally distributed, Eq. (A2) also coincides with the score equation (the first derivative of the log likelihood equated to zero). Nonetheless, the EE estimator is robust to distributional misspecification because Eq. (A2) assumes nothing beyond the expectation of the regression error.
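To make the EE view of least squares concrete, here is a minimal numerical sketch (synthetic data; all names are illustrative, not from our implementation): the root of the normal equation [Eq. (A2)] is computed directly, and the estimating function is verified to vanish there even with heavy-tailed errors.

```python
# A minimal sketch of Eq. (A2) as an estimating equation: the least
# squares estimator is the root of Phi_n(beta) = (1/n) sum_i X_i (Y_i - X_i^T beta).
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))                       # covariates
beta_true = np.array([1.0, -0.5, 2.0])
Y = X @ beta_true + rng.standard_t(df=5, size=n)  # heavy-tailed, non-normal errors

# Root of the EE = solution of the normal equations.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# The estimating function is numerically zero at beta_hat.
Phi = X.T @ (Y - X @ beta_hat) / n
print(beta_hat, np.max(np.abs(Phi)))              # Phi is ~ machine zero
```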
The method of EEs is a general strategy for parameter estimation in statistical applications. The estimator is the root (zero) of a set of data-dependent functions called estimating functions. In particular, let $\{X_i: i = 1, \ldots, n\}$ be the observed data of sample size n, not necessarily independent copies, and let θ be a p × 1 parameter vector to be estimated. Consider a p-dimensional estimating function $G(\theta; X_i)$, i = 1, …, n, which depends on both θ and the data. If the expectation of $G(\theta; X_i)$ with respect to $X_i$ is zero, then we have the EE
$$\Phi_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} G(\theta; X_i) = 0. \tag{A3}$$
The estimator $\hat{\theta}_n$ is the root of Eq. (A3), i.e., $\Phi_n(\hat{\theta}_n) = 0$.

The framework of EEs is very general. When the likelihood is available from a fully specified model, Eq. (A3) can be the score equation (the first derivative of the log likelihood). When the estimator is obtained by optimizing an objective function (e.g., least squares, minimum risk), Eq. (A3) can be the first derivative of the objective function with respect to θ. Nonetheless, Eq. (A3) need not correspond to an optimization problem. The only requirement is $E[G(\theta; X_i)] = 0$, i = 1, …, n, which is much weaker than a full likelihood specification. Moment conditions are often used to construct estimating functions, which is why EE estimators are generally more robust than likelihood estimators.
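As an illustration of an EE built purely from moment conditions, the following sketch (not from the paper; synthetic data) estimates a mean and variance by root finding, with no likelihood specified:

```python
# A sketch of a generic EE from moment conditions only: estimate
# (mu, sigma^2) from E[X - mu] = 0 and E[(X - mu)^2 - sigma^2] = 0.
import numpy as np
from scipy.optimize import root

rng = np.random.default_rng(1)
X = rng.gamma(shape=2.0, scale=1.5, size=500)         # skewed, non-normal data

def Phi(theta):
    mu, sig2 = theta
    return np.array([(X - mu).mean(),                 # first moment condition
                     ((X - mu) ** 2 - sig2).mean()])  # second moment condition

sol = root(Phi, x0=np.array([1.0, 1.0]))
print(sol.x)  # close to the true mean 3.0 and variance 4.5
```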

The asymptotic properties of $\hat{\theta}_n$ as n → ∞ can be derived under fairly general regularity conditions. The estimator $\hat{\theta}_n$ converges to the true θ in probability; i.e., $\hat{\theta}_n$ is asymptotically unbiased. The distributional properties of $\hat{\theta}_n$ are inherited from the behavior of the estimating functions. A first-order Taylor approximation of $\Phi_n(\hat{\theta}_n)$ around the true θ gives
$$0 = \Phi_n(\hat{\theta}_n) \approx \Phi_n(\theta) + \dot{\Phi}_n(\theta)(\hat{\theta}_n - \theta),$$
where $\dot{\Phi}_n(\theta) = \partial \Phi_n(\theta) / \partial \theta^T$. Therefore,
$$\sqrt{n}(\hat{\theta}_n - \theta) \approx -[\dot{\Phi}_n(\theta)]^{-1} [\sqrt{n}\, \Phi_n(\theta)],$$
where the first term in brackets, $\dot{\Phi}_n(\theta)$ (an average), converges in probability to the expectation of the derivative of the estimating function, $\dot{G}(\theta) = E[\dot{G}(\theta; X_i)]$, and the second term is asymptotically normal by the central limit theorem (with possibly dependent data) with variance $V(\theta) = \lim_{n \to \infty} \operatorname{cov}[\sqrt{n}\, \Phi_n(\theta)]$. So the asymptotic variance of $\sqrt{n}(\hat{\theta}_n - \theta)$ has the sandwich form $\dot{G}^{-1}(\theta) V(\theta) [\dot{G}^{-1}(\theta)]^T$. An estimator of the asymptotic variance is obtained by replacing the unknown θ with its estimator $\hat{\theta}_n$.
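A minimal sketch of the sandwich variance estimator for the least squares EE above, assuming independent observations for simplicity (synthetic data, illustrative names); with temporal dependence, the middle matrix would have to be estimated differently, as discussed next:

```python
# Sandwich (robust) variance for the least squares EE, assuming i.i.d. data.
import numpy as np

rng = np.random.default_rng(2)
n, p = 500, 2
X = rng.normal(size=(n, p))
Y = X @ np.array([0.5, -1.0]) + rng.normal(scale=2.0, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat

G_dot = -(X.T @ X) / n                  # average derivative of G ("bread")
Gi = X * resid[:, None]                 # G(beta_hat; X_i), one row per i
V_hat = Gi.T @ Gi / n                   # estimate of V(theta) ("meat")
bread_inv = np.linalg.inv(G_dot)
var_beta = bread_inv @ V_hat @ bread_inv.T / n
print(np.sqrt(np.diag(var_beta)))       # robust (sandwich) standard errors
```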

In the OF context, we constructed unbiased EEs after deriving the bias in appendix B. The sample size n is the number of time points T, with each time point treated as a cluster. The data $X_t$ from cluster t contain $Y_t$ and $\tilde{X}_t$ of section 2a. Because of the temporal dependence, the middle matrix V(θ) in the sandwich form is hard to estimate, which motivated our pseudo bootstrap procedure in section 2c.
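The following schematic sketch, our illustration rather than the exact procedure of section 2c, conveys the idea of the pseudo residual bootstrap: whole control-run replicates stand in for the regression noise so that both spatial and temporal dependence are preserved. The function `solve_ee` is hypothetical, and any scaling of the pseudo residuals [e.g., by δ(β̂, â) of appendix C] is omitted here.

```python
# Schematic pseudo residual bootstrap (illustrative, not the full method).
# `control` has shape (L, T, S); `X_tilde` has shape (T, S, J);
# `solve_ee` is a hypothetical solver of the bias-corrected EE.
import numpy as np

def pseudo_residual_bootstrap(Y, X_tilde, control, solve_ee, n_boot=200, seed=None):
    rng = np.random.default_rng(seed)
    beta_hat = solve_ee(Y, X_tilde)
    L = control.shape[0]
    boots = []
    for _ in range(n_boot):
        e_star = control[rng.integers(L)]   # whole replicate: keeps dependence
        Y_star = np.einsum('tsj,j->ts', X_tilde, beta_hat) + e_star
        boots.append(solve_ee(Y_star, X_tilde))
    return beta_hat, np.cov(np.asarray(boots), rowvar=False)
```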

APPENDIX B

Bias from Using $\tilde{X}$ in Place of X

The EE [Eq. (3)] is no longer unbiased when X is substituted by $\tilde{X}$. Because assumptions 2 and 3 imply that the measurement errors are also stationary, we only need to derive the expectation of the EE for one cluster. For cluster t, let $\nu(t) = (\nu_{t1}, \ldots, \nu_{tJ})$, an S × J matrix. We have
$$\begin{aligned} E\{\tilde{X}_t^T \Sigma_t^{-1} (Y_t - \tilde{X}_t \beta)\} &= E\big([X_t + \nu(t)]^T \Sigma_t^{-1} \{Y_t - [X_t + \nu(t)]\beta\}\big) \\ &= -E[\nu(t)^T \Sigma_t^{-1} \nu(t)]\beta \\ &= -\operatorname{diag}\{E(\nu_{t1}^T \Sigma_t^{-1} \nu_{t1}), \ldots, E(\nu_{tJ}^T \Sigma_t^{-1} \nu_{tJ})\}\beta \\ &= -\operatorname{diag}\{\operatorname{tr}[\Sigma_t^{-1} E(\nu_{t1}\nu_{t1}^T)], \ldots, \operatorname{tr}[\Sigma_t^{-1} E(\nu_{tJ}\nu_{tJ}^T)]\}\beta \\ &= -\operatorname{diag}\{\operatorname{tr}(\Sigma_t^{-1}\Omega_{t1}), \ldots, \operatorname{tr}(\Sigma_t^{-1}\Omega_{tJ})\}\beta \\ &= -aS \operatorname{diag}(1/m_1, \ldots, 1/m_J)\beta. \end{aligned}$$
The third equation above holds because $\nu_{tj}$ and $\nu_{tj'}$ are uncorrelated for j ≠ j′. In the last equation, the t indices can be dropped because of assumption 2. This derivation is the basis for constructing unbiased EEs in terms of $\tilde{X}$.
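The key step can be checked numerically. The following Monte Carlo sketch (synthetic Σ; normality used only for convenience) confirms that, with independent columns $\nu_{tj} \sim (0, a\Sigma_t/m_j)$, the matrix $E[\nu(t)^T \Sigma_t^{-1} \nu(t)]$ is approximately $aS\operatorname{diag}(1/m_1, \ldots, 1/m_J)$:

```python
# Monte Carlo check of E[nu^T Sigma^{-1} nu] = a*S*diag(1/m_1, ..., 1/m_J).
import numpy as np

rng = np.random.default_rng(3)
S, J, a = 6, 2, 1.5
m = np.array([3, 5])
A = rng.normal(size=(S, S))
Sigma = A @ A.T + S * np.eye(S)           # a positive definite Sigma
Sigma_inv = np.linalg.inv(Sigma)
Ls = [np.linalg.cholesky(a * Sigma / m[j]) for j in range(J)]

acc, n_rep = np.zeros((J, J)), 20000
for _ in range(n_rep):
    # nu has independent columns with covariance a * Sigma / m_j.
    nu = np.column_stack([Ls[j] @ rng.normal(size=S) for j in range(J)])
    acc += nu.T @ Sigma_inv @ nu
print(acc / n_rep)                        # off-diagonals ~ 0
print(a * S / m)                          # diagonal target: [3.0, 1.8]
```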

APPENDIX C

Estimation of a

Define the block residual $r_t(\beta) = Y_t - \tilde{X}_t \beta$ for t = 1, …, T; then
$$\operatorname{var}\{r_t(\beta)\} = \operatorname{var}(Y_t - \tilde{X}_t \beta) = \operatorname{var}\{X_t \beta + \epsilon_t - [X_t + \nu(t)]\beta\} = \operatorname{var}(\epsilon_t) + \operatorname{var}[\nu(t)\beta] = \delta^2(\beta, a) \Sigma_t,$$
where $\delta(\beta, a) = (1 + a \sum_{j=1}^{J} \beta_j^2 / m_j)^{1/2}$. Note that
$$\operatorname{var}\{\Psi_t^{-1/2} r_t(\beta)\} = \Psi_t^{-1/2}\, \delta^2(\beta, a)\, \Sigma_t\, \Psi_t^{-1/2} = \delta^2(\beta, a)\, \Sigma_t^{-1/2} \frac{\Sigma_t}{a} \Sigma_t^{-1/2} = \frac{\delta^2(\beta, a)}{a} I_S = \frac{1 + a \sum_{j=1}^{J} \beta_j^2 / m_j}{a} I_S.$$
A feasible EE for a is
$$a S_{r,T}^2 - \Big(1 + a \sum_{j=1}^{J} \hat{\beta}_{T,j}^2 / m_j\Big) = 0,$$
where $S_{r,T}^2$ is the sample variance of the elements of $\hat{\Psi}^{+1/2} r_t(\hat{\beta}_T)$. A closed-form estimator of a is
$$\hat{a}_T = \frac{1}{S_{r,T}^2 - \sum_{j=1}^{J} \hat{\beta}_{T,j}^2 / m_j}.$$
From the derivation, $\hat{a}_T$ is consistent for the true a; i.e., it converges to a as T → ∞. The variance of $\hat{a}_T$ can be obtained jointly with that of $\hat{\beta}_T$, which can be used to construct confidence intervals and conduct hypothesis tests about a; see details in the supplemental material.
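A direct transcription of the closed-form estimator, assuming the standardized block residuals and $\hat{\beta}_T$ are available from the main fit (the inputs here are placeholders):

```python
# Closed-form estimator of a; inputs are placeholders from the main fit.
import numpy as np

def estimate_a(std_resid, beta_hat, m):
    """a_hat = 1 / (S_r^2 - sum_j beta_hat_j^2 / m_j), where S_r^2 is the
    sample variance of the elements of Psi_hat^{+1/2} r_t(beta_hat)."""
    s2 = np.var(np.ravel(std_resid), ddof=1)
    return 1.0 / (s2 - np.sum(np.asarray(beta_hat) ** 2 / np.asarray(m)))
```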

APPENDIX D

Asymptotic Normality of $\hat{\beta}_T$

Since $\hat{\beta}_T$ solves the EE [Eq. (6)], a Taylor expansion of the equation at the true parameter value β gives
$$0 = T^{-1/2} \sum_{t=1}^{T} G_t(\hat{\beta}_T; \hat{\Sigma}^+) = T^{-1/2} \sum_{t=1}^{T} G_t(\beta; \hat{\Sigma}^+) + \Big\{T^{-1} \sum_{t=1}^{T} \partial G_t(\beta; \hat{\Sigma}^+) / \partial \beta^T \Big\}\, T^{1/2} (\hat{\beta}_T - \beta) + o_p(1).$$
Thus,
$$T^{1/2} (\hat{\beta}_T - \beta) = -\Big\{T^{-1} \sum_{t=1}^{T} \partial G_t(\beta; \hat{\Sigma}^+) / \partial \beta^T \Big\}^{-1} T^{-1/2} \sum_{t=1}^{T} G_t(\beta) + o_p(1).$$
Since $G_t$ is stationary over time, under mild conditions, $T^{-1/2} \sum_{t=1}^{T} G_t(\beta)$ converges in distribution to a normal distribution N(0, B). Further, $T^{-1} \sum_{t=1}^{T} \partial G_t(\beta; \hat{\Sigma}^+) / \partial \beta^T \to A$ by the law of large numbers. The asymptotic normality of $\hat{\beta}_T$ then follows.

APPENDIX E

Constructing Σ with Block Toeplitz Structure

In the simulation, the true covariance with a block Toeplitz structure was constructed as follows. First, we calculated the sample covariance matrix from the control runs, denoted $\tilde{\Sigma}$. Then, we imposed a block Toeplitz structure on $\tilde{\Sigma}$ by averaging the main-diagonal blocks and the blocks along each off-diagonal band, respectively. If necessary, we truncated some off-diagonal blocks so that the temporal correlation decays as the time lag increases. Finally, the linear shrinkage method was applied to the resulting matrix so that the estimate of the true Σ is positive definite.
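A sketch of this construction in code, assuming a time-major TS × TS sample covariance; the truncation lag and shrinkage weight are illustrative placeholders rather than the data-driven choices used in the simulation:

```python
# Block Toeplitz averaging of a TS x TS sample covariance (time-major).
import numpy as np

def block_toeplitz(Sigma_tilde, S, T, max_lag=None, shrink=0.1):
    # Average the S x S blocks over each time lag k = 0, ..., T-1.
    bands = []
    for k in range(T):
        blocks = [Sigma_tilde[t * S:(t + 1) * S, (t + k) * S:(t + k + 1) * S]
                  for t in range(T - k)]
        B = np.mean(blocks, axis=0)
        if max_lag is not None and k > max_lag:
            B = np.zeros_like(B)            # truncate distant lags
        bands.append(B)
    # Rebuild the full matrix; transposed bands below the diagonal keep symmetry.
    Sigma = np.zeros_like(Sigma_tilde)
    for t in range(T):
        for u in range(T):
            block = bands[u - t] if u >= t else bands[t - u].T
            Sigma[t * S:(t + 1) * S, u * S:(u + 1) * S] = block
    # Linear shrinkage toward a scaled identity for positive definiteness.
    mu = np.trace(Sigma) / Sigma.shape[0]
    return (1 - shrink) * Sigma + shrink * mu * np.eye(Sigma.shape[0])
```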

APPENDIX F

Details of the 10 Spatial Scales in Section 4

The names, coordinate ranges, spatiotemporal dimensions, and dimensions of the observations after removing missing values for the 10 spatial scales in section 4 are presented in Table F1.

Table F1. Details of the names, coordinate ranges, spatiotemporal dimensions (S and T), and dimension of observation (ST) after removing missing values of the 10 scales in section 4.

REFERENCES

  • Allen, M. R., and S. F. B. Tett, 1999: Checking for model consistency in optimal fingerprinting. Climate Dyn., 15, 419–434, https://doi.org/10.1007/s003820050291.

  • Allen, M. R., and P. A. Stott, 2003: Estimating signal amplitudes in optimal fingerprinting, part I: Theory. Climate Dyn., 21, 477–491, https://doi.org/10.1007/s00382-003-0313-9.

  • Bindoff, N. L., and Coauthors, 2013: Detection and attribution of climate change: From global to regional. Climate Change 2013: The Physical Science Basis, T. F. Stocker et al., Eds., Cambridge University Press, 867–952.

  • Bonan, D. B., F. Lehner, and M. M. Holland, 2021: Partitioning uncertainty in projections of Arctic Sea ice. Environ. Res. Lett., 16, 044002, https://doi.org/10.1088/1748-9326/abe0ec.

  • Carroll, R. J., D. Ruppert, L. A. Stefanski, and C. M. Crainiceanu, 2006: Measurement Error in Nonlinear Models: A Modern Perspective. 2nd ed. CRC Press, 488 pp.

  • Danabasoglu, G., 2019: NCAR CESM2 model output prepared for CMIP6 ScenarioMIP ssp245. Earth System Grid Federation, accessed 9 March 2022, https://doi.org/10.22033/ESGF/CMIP6.7748.

  • DelSole, T., L. Trenary, X. Yan, and M. K. Tippett, 2019: Confidence intervals in optimal fingerprinting. Climate Dyn., 52, 4111–4126, https://doi.org/10.1007/s00382-018-4356-3.

  • Eyring, V., and Coauthors, 2021: Human influence on the climate system. Climate Change 2021: The Physical Science Basis, V. Masson-Delmotte et al., Eds., Cambridge University Press, 423–552.

  • Gillett, N. P., and Coauthors, 2021: Constraining human contributions to observed warming since the pre-industrial period. Nat. Climate Change, 11, 207–212, https://doi.org/10.1038/s41558-020-00965-9.

  • Giorgi, F., and R. Francisco, 2000: Uncertainties in regional climate change prediction: A regional analysis of ensemble simulations with the HADCM2 coupled AOGCM. Climate Dyn., 16, 169–182, https://doi.org/10.1007/PL00013733.

  • Gneiting, T., and A. E. Raftery, 2007: Strictly proper scoring rules, prediction, and estimation. J. Amer. Stat. Assoc., 102, 359–378, https://doi.org/10.1198/016214506000001437.

  • Godambe, V. P., 1991: Estimating Functions. Oxford University Press, 356 pp.

  • Hannart, A., 2016: Integrated optimal fingerprinting: Method description and illustration. J. Climate, 29, 1977–1998, https://doi.org/10.1175/JCLI-D-14-00124.1.

  • Hegerl, G. C., H. von Storch, K. Hasselmann, B. D. Santer, U. Cubasch, and P. D. Jones, 1996: Detecting greenhouse-gas-induced climate change with an optimal fingerprint method. J. Climate, 9, 2281–2306, https://doi.org/10.1175/1520-0442(1996)009<2281:DGGICC>2.0.CO;2.

  • Hegerl, G. C., and Coauthors, 2007: Understanding and attributing climate change. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 663–745.

  • Heyde, C. C., 2008: Quasi-Likelihood and Its Application: A General Approach to Optimal Parameter Estimation. Springer Science and Business Media, 246 pp.

  • Huntingford, C., P. A. Stott, M. R. Allen, and F. H. Lambert, 2006: Incorporating model uncertainty into attribution of observed temperature change. Geophys. Res. Lett., 33, L05710, https://doi.org/10.1029/2005GL024831.

  • Katzfuss, M., D. Hammerling, and R. L. Smith, 2017: A Bayesian hierarchical model for climate change detection and attribution. Geophys. Res. Lett., 44, 5720–5728, https://doi.org/10.1002/2017GL073688.

  • Li, H., C. A. Calder, and N. Cressie, 2007: Beyond Moran's I: Testing for spatial dependence based on the spatial autoregressive model. Geogr. Anal., 39, 357–375, https://doi.org/10.1111/j.1538-4632.2007.00708.x.

  • Li, Y., K. Chen, J. Yan, and X. Zhang, 2021: Uncertainty in optimal fingerprinting is underestimated. Environ. Res. Lett., 16, 084043, https://doi.org/10.1088/1748-9326/ac14ee.

  • Li, Y., K. Chen, J. Yan, and X. Zhang, 2023: Regularized fingerprinting in detection and attribution of climate change with weight matrix optimizing the efficiency in scaling factor estimation. Ann. Appl. Stat., 17, 225–239, https://doi.org/10.1214/22-AOAS1624.

  • Liu, Y., and J. Xie, 2020: Cauchy combination test: A powerful test with analytic p-value calculation under arbitrary dependency structures. J. Amer. Stat. Assoc., 115, 393–402, https://doi.org/10.1080/01621459.2018.1554485.

  • Moran, P. A., 1950: Notes on continuous stochastic phenomena. Biometrika, 37, 17–23, https://doi.org/10.2307/2332142.

  • Morice, C. P., J. J. Kennedy, N. A. Rayner, and P. D. Jones, 2012: Quantifying uncertainties in global and regional temperature change using an ensemble of observational estimates: The HadCRUT4 data set. J. Geophys. Res., 117, D08101, https://doi.org/10.1029/2011JD017187.

  • Paradis, E., J. Claude, and K. Strimmer, 2004: APE: Analyses of phylogenetics and evolution in R language. Bioinformatics, 20, 289–290, https://doi.org/10.1093/bioinformatics/btg412.

  • Pešta, M., 2013: Total least squares and bootstrapping with applications in calibration. Statistics, 47, 966–991, https://doi.org/10.1080/02331888.2012.658806.

  • Ribes, A., S. Planton, and L. Terray, 2013: Application of regularised optimal fingerprinting to attribution. Part I: Method, properties and idealised analysis. Climate Dyn., 41, 2817–2836, https://doi.org/10.1007/s00382-013-1735-7.

  • Swart, N. C., J. C. Fyfe, E. Hawkins, J. E. Kay, and A. Jahn, 2015: Influence of internal variability on Arctic sea-ice trends. Nat. Climate Change, 5, 86–89, https://doi.org/10.1038/nclimate2483.

  • Swart, N. C., and Coauthors, 2019: The Canadian Earth System Model version 5 (CanESM5.0.3). Geosci. Model Dev., 12, 4823–4873, https://doi.org/10.5194/gmd-12-4823-2019.

  • Wang, Z., Y. Jiang, H. Wan, J. Yan, and X. Zhang, 2021: Toward optimal fingerprinting in detection and attribution of changes in climate extremes. J. Amer. Stat. Assoc., 116 (533), 1–13, https://doi.org/10.1080/01621459.2020.1730852.

  • Zhang, X., F. W. Zwiers, and P. A. Stott, 2006: Multimodel multisignal climate change detection at regional scale. J. Climate, 19, 4294–4307, https://doi.org/10.1175/JCLI3851.1.

Supplementary Materials

Fig. 1. Estimated scaling factors with 90% confidence intervals of the ANT and NAT forcings at the global, continental, and subcontinental scales during 1951–2020. The confidence intervals for TLS-TS and TLS-PBC in the WNA, ENA, and SCA regions are truncated for better visualization.
