## 1. Introduction

The characterization of large-scale extratropical atmospheric variability is a matter of open debate. A paradigmatic view is to regard the low-frequency projection of the atmospheric attractor as a superposition of dynamical regimes (Dole 1983).

It has been argued that large-scale atmospheric variability at the monthly time scale over the North Atlantic–European region (NAE) is characterized by transition or permanence among four dynamical winter regimes: the positive phase of the North Atlantic Oscillation (NAO+), the corresponding negative phase (NAO−), the Greenland–Scandinavian dipole, and the Atlantic anticyclonic ridge (Cassou et al. 2004). These regimes, obtained by cluster analysis, are not organized exactly symmetrically around the climatology; that is, they are asymmetric, as normally occurs in chaotic nonlinearly forced systems (Palmer 1999). Moreover, composites of surface forcing anomalies [e.g., sea surface temperature (SST)], computed for opposite regimes, show some degree of asymmetry (Robinson et al. 2003; Wu and Hsieh 2004). This has been shown in particular for the tripole Atlantic and Pacific SST forcings of the NAO+ and the NAO− quasi-antisymmetric regimes (Cassou et al. 2004).

The link between average surface climatic conditions (e.g., large-scale precipitation and surface temperature) and middle-tropospheric regimes can be decomposed in terms of a monotonic linear influence and nonlinear terms responsible for possible asymmetric responses. As a consequence, the joint probability density functions (PDFs) of large-scale indexes and climatic variables can express some degree of asymmetry and non-Gaussianity. Some studies corroborate this fact, for example, the nonlinear, asymmetric response of the surface temperature over Europe for symmetric quantiles of the North Atlantic Oscillation (NAO) index (Pozo-Vásquez et al. 2001; Trigo and Palutikof 1999). Another example is the asymmetry of dry and wet self-organizing maps (Cavazos 2000) and their different correlations with the Arctic Oscillation and NAO indexes. Another study hypothesizes the asymmetric response of the Indian Ocean precipitation to the NAO (M. R. P. Sapiano and P. A. Arkin 2005, personal communication).

Motivated by this issue, this paper is a contribution to inferring the degree of non-Gaussianity and asymmetry within the statistical response of the monthly winter [December–February (DJF)] precipitation to the NAO over the NAE.

Both the mean influence of the NAO and its trend on the monthly winter precipitation over the NAE are well documented (Hurrell et al. 2004). That influence is essentially due to (a) the different tracking of synoptic storms in the presence of NAO+ or NAO− regimes (Rogers 1997; Hurrell 1995), and (b) the enhancement, for particular regimes, of local systems associated with particular geographical and orographic conditions, for instance for Greenland and Iceland (Serreze et al. 1997). The linear component of that influence can be assessed through the one-point-linear correlation map between the monthly precipitation and the NAO monthly index, defined as the Lisbon, Portugal, minus Stykkisholmur, Iceland, normalized average monthly sea level pressure (SLP) anomaly, or other correlated quantities (Osborn et al. 1999).

This correlation map shows a dipolar structure with extreme positive correlations near 0.6 to the south of Iceland around 60°N, and extreme negative values near −0.6 over the North Atlantic basin around 40°N latitude.

A diagnostic measure is built in this paper in order to quantify the asymmetric part of the precipitation response to NAO that is undetected by the linear correlation. For that purpose, side or asymmetric correlations between the NAO index and monthly precipitation are computed. This essentially consists of evaluating conditional correlations for both the positive and negative NAO regimes, thus revealing possible asymmetric or non-Gaussian precipitation responses to NAO. Asymmetry is only a particular aspect of something more general: non-Gaussianity. In the paper, two methods are used in order to evaluate bivariate non-Gaussian PDFs. From these we obtain relevant diagnostics from information theory (Shannon 1948), such as negentropy and mutual information (MI), as well as its Gaussian and non-Gaussian counterparts (Kraskov et al. 2004). The first method assumes a weak non-Gaussianity scenario. The degree of non-Gaussianity of the joint PDF (NAO, monthly precipitation on a point basis) is evaluated through Edgeworth expansions (Edgeworth 1905; Comon 1994), based on the Hermite polynomials (Abramowitz and Stegun 1972) and higher-order statistical moments. The second method is based on the maximum entropy principle (Jaynes 1982) and is applicable without restriction on the amplitude of non-Gaussianity. PDF evaluation is also possible through the maximum likelihood method (Sivia 1996) and kernel estimation (Silverman 1986). Apart from this paper, information theory is used in other applications, such as predictability studies (DelSole 2004), forecast evaluation (Roulston and Smith 2002), independent component analysis of climatological data (Aires et al. 2002), and computation of mutual information among climatic data (Marwan and Kurths 2002).

The paper begins with a theoretical section (section 2) that supplies relevant properties of side correlations and mutual information, together with MI estimators based on the Edgeworth expansion and on the maximum entropy method (ME). Data and processing methodologies are then presented in section 3, followed by results and their analysis in section 4. We then conclude with a discussion in section 5 and an appendix with mathematical developments.

## 2. Theoretical background

### a. General measures of correlation

A standard measure of the statistical association between two random variables $X$ and $Y$ is the linear or Pearson correlation (Papoulis 1991), represented by $c(X, Y)$. When $c(X, Y) \neq 0$, there is common information or statistical redundancy. However, the reverse does not hold, in general, because this measure does not account for nonlinear relationships between variables. Those relations can be taken into account through the Pearson correlation $c[A(X), B(Y)]$ between general nonlinear functions $A(X)$ and $B(Y)$. The correlation is maximized in absolute value when $A(x) = E(Y \mid X = x)$, $B(y) = E(X \mid Y = y)$. A particular case of nonlinear correlation is the Spearman or rank correlation (Wilks 1995), where the nonlinear functions are simply the sampling ranks of $X$ and $Y$, respectively, measuring the degree of monotonic association between both variables. Another nonlinear correlation is hereby denoted as Gaussian correlation and defined as

$$c_g(X, Y) \equiv c(g_X, g_Y), \tag{1}$$

where $g_X$ (equivalently for $Y$) is the standard Gaussian transformation or Gaussian anamorphosis of $X$, given by

$$g_X = \Phi^{-1}\left[\int_{-\infty}^{X} \rho_X(u)\,du\right], \tag{2}$$

where $\rho_X(u)$ is the PDF of $X$ and $\Phi^{-1}$ is the inverse of the cumulative standard Gaussian distribution function. This transformation is a common data analysis procedure in geostatistical kriging and climatic data analysis (Biau et al. 1999) that ensures that the marginal distributions are standardized Gaussians. However, while marginal distributions are rendered Gaussian, the joint distribution does not necessarily become Gaussian. That way, if $X$ and $Y$ have a joint Gaussian distribution, $c_g(X, Y) = c(X, Y)$. Both rank and Gaussian correlations are nonlinear correlations that are invariant for the class of monotonous homeomorphisms on $X$ and $Y$ individually (though not necessarily homeomorphisms mixing these variables). Consequently, an advantage of both of these measures over the Pearson correlation is the fact that, unlike the latter, the former are not artificially inflated by the coincidence of large outlier values (Jolliffe and Stephenson 2003). Results will be shown, both for a pair $(X, Y)$ of untransformed variables and for their corresponding Gaussian anamorphoses, where $X$ is the standardized (i.e., zero average, unit variance) principal component (PC)-based NAO index (see section 3) and the variable $Y$ is the standardized monthly precipitation.
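The three correlation measures of this subsection can be sketched in a few lines of Python. This is only an illustration on synthetic data: the helper names (`pearson`, `ranks`, `anamorphosis`), the lognormal test sample, and the plotting position $k/(n+1)$ used for the empirical anamorphosis are assumptions of the sketch, not taken from the paper.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Pearson (linear) correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(v):
    """Ranks 1..n of the values in v (ties, assumed absent, are broken by position)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for k, i in enumerate(order, start=1):
        r[i] = k
    return r


def anamorphosis(v):
    """Empirical Gaussian anamorphosis: rank k -> Phi^{-1}(k / (n + 1)) (assumed form)."""
    n = len(v)
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf(k / (n + 1)) for k in ranks(v)]


def rank_correlation(x, y):
    """Spearman correlation: Pearson correlation of the sample ranks."""
    return pearson(ranks(x), ranks(y))


def gaussian_correlation(x, y):
    """Gaussian correlation: Pearson correlation of the anamorphosed variables."""
    return pearson(anamorphosis(x), anamorphosis(y))


# Synthetic sample: Y is a lognormal (strongly non-Gaussian) response to X.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [math.exp(a + 0.5 * rng.gauss(0.0, 1.0)) for a in x]

c = pearson(x, y)
c_rank = rank_correlation(x, y)
c_g = gaussian_correlation(x, y)

# Rank and Gaussian correlations are unchanged by a monotonic map of Y alone.
log_y = [math.log(b) for b in y]
c_rank_log = rank_correlation(x, log_y)
c_g_log = gaussian_correlation(x, log_y)
```

Because both `rank_correlation` and `gaussian_correlation` depend on the data only through the ranks, replacing `y` by `log_y` leaves them unchanged, whereas `pearson(x, log_y)` generally differs from `pearson(x, y)`.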

### b. Asymmetric Gaussian correlations

The sensitivity of one variable ($Y$) to another ($X$) is not necessarily the same over every subdomain of $X$. A global correlation measure is not able to account for the sensitivity of one variable to the other within a particular subdomain. To do so, we introduce the conditional correlation $c(X, Y \mid X \in I_X)$, also called asymmetric correlation, between two standard variables $X$, $Y$ for a certain interval $I_X$ of $X$. We consider here the partition of $X$ into two complementary intervals, separated by the median $M_X$ of $X$. In particular, for $X$ being the NAO index, that partition separates the NAO− and NAO+ regimes. The corresponding asymmetric or side correlations are defined as

$$c_- \equiv c(X, Y \mid X < M_X), \qquad c_+ \equiv c(X, Y \mid X > M_X).$$

If $X$, $Y$ are centered and have unit variance, it is easy to verify that […] [(5a), (5b)] […] $(X, Y)$ relationship. For standard variables, the correlation is simply given by the covariance, which can be decomposed, as for any set partition, into intra- and interset covariances. For the referred $X$ partitioned into two halves, we have […] [(6)] […] $(X, Y)$, proportional to the difference between the asymmetric conditional $Y$ means. Given the constraints [(5a), (5b)] it is also clear that $|c_M| \leq 1$ and that $|c_M|$ tends to increase for low conditional $Y$ variance. The other two terms of (6) are also less than one in absolute value.

We now compare $c_M$, $c_+$, and $c_-$ with those that would be obtained if $X$ and $Y$ were jointly Gaussian. In this case, the above quantities are functions of the correlation $c$ […], and $|c_+|$ and $|c_-|$ are below the absolute value $|c|$. This means that, under Gaussian conditions, the global correlation $c$ is always greater in absolute value than both side correlations. Both of their contributions for $c$ are equal to $\beta c$ in decomposition (6). This leads us to define statistical tests of bi-Gaussianity, hereafter called test central correlation $t_M$, test positive side correlation $t_+$, and test negative side correlation $t_-$, which are directly comparable to the correlation in the Gaussian case, […] [(11)]. The differences between $c$ and the tests [(11)] are measures of the distribution asymmetry. The vanishing of all of those differences is a necessary but not sufficient condition for joint Gaussianity. To get a measure of asymmetry that is independent from the correlation $c$, we will consider the pair of uncorrelated variables $(X, Y_r)$, where $Y_r$ is the standardized residue of the linear prediction of $Y$ from $X$,

$$Y_r = \frac{Y - cX}{\sqrt{1 - c^2}}. \tag{12}$$

[…] $(X, Y_r)$ correlation as a combination of expectancies as follows: […] $J_c$ of asymmetry, as […] [(14)]. By subtracting $M_X$ from $X$ and taking the absolute value of the product $XY_r$ in (14), it can be inferred that $J_c$ is proportional to the nonlinear correlation $c(|X - M_X|, Y_r)$, which has, under some conditions, a monotonic relationship with the nonlinear correlation $c(|X - M_X|^2, Y_r)$.
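As a minimal sketch of the side correlations described above, the following Python code computes the conditional correlations below and above the median of $X$ for two synthetic samples: a jointly Gaussian pair and a pair whose response is confined to the negative side. The function names and the synthetic data are assumptions made for illustration; the test quantities $t_M$, $t_+$, $t_-$ and the measure $J_c$ are not reproduced here.

```python
import math
import random
from statistics import median


def pearson(x, y):
    """Pearson (linear) correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def side_correlations(x, y):
    """Conditional correlations (c_minus, c_plus) below and above the median of x."""
    m = median(x)
    lower = [(a, b) for a, b in zip(x, y) if a < m]
    upper = [(a, b) for a, b in zip(x, y) if a >= m]
    c_minus = pearson(*zip(*lower))
    c_plus = pearson(*zip(*upper))
    return c_minus, c_plus


rng = random.Random(1)
n = 4000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Jointly Gaussian pair with correlation 0.6: both side correlations fall below |c|.
y_gauss = [0.6 * a + 0.8 * e for a, e in zip(x, noise)]
c_gauss = pearson(x, y_gauss)
cm_gauss, cp_gauss = side_correlations(x, y_gauss)

# Asymmetric pair: Y responds to X only on the negative side of X.
y_asym = [min(a, 0.0) + 0.5 * e for a, e in zip(x, noise)]
cm_asym, cp_asym = side_correlations(x, y_asym)
```

In the Gaussian sample both side correlations are smaller in absolute value than the global correlation, as stated in the text, while in the asymmetric sample only the negative-side correlation is appreciable.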

### c. Mutual information and negentropy

Beyond asymmetry, a more complete approach to diagnosing non-Gaussianity and statistical redundancy, based on concepts of information theory (Shannon 1948), is considered here.

The central quantity is the mutual information (MI) $I(X, Y)$ between $X$ and $Y$. Mutual information is nonnegative and measures the reduction of uncertainty of a random variable given the knowledge about the other, and vice versa. Mathematically speaking, MI is defined as

$$I(X, Y) \equiv h(X) + h(Y) - h(X, Y),$$

where $h(\,)$ is the differential entropy. Mutual information is the Kullback–Leibler distance (KLD) between the joint PDF $\rho_{X,Y}(X, Y)$ and the product $\rho_X(X)\rho_Y(Y)$ of the marginal PDFs (Cover and Thomas 1991). Mutual information vanishes iff (if and only if) $X$ and $Y$ are statistically independent or, equivalently, iff all nonlinear correlations are zero for smooth PDFs, thus making MI a stronger measure of independence than the Pearson correlation. Furthermore, MI is invariant for any $X$ and $Y$ single homeomorphisms. In particular, it is invariant when $X$ and $Y$ are replaced by the corresponding Gaussian anamorphosis, say $I(X, Y) = I(g_X, g_Y)$. For a given correlation $c$, a positive lower MI bound $I_g$ (hereby denoted as Gaussian mutual information) can be found by solving a constrained variational problem of MI minimization (Kraskov et al. 2004), thus leading to the decomposition

$$I(X, Y) = I_g + I_{ng},$$

where $I_g$ is the MI of a Gaussian pair $(X, Y)$ with correlation $c$ (Cover and Thomas 1991). The upper bound [(18)] can be generalized by replacing the correlation $c$ by any nonlinear $(X, Y)$ correlation. The Gaussian upper bound [(18)] of MI is also used in speech analysis (Abdallah and Plumbey 2003). Mutual information can be compared with the Gaussian correlation by defining a distance between $X$ and $Y$, hereby denoted as information correlation $c_{\mathrm{inf}}$, and given by […] $(X, Y)$ is Gaussian or, equivalently, iff $I_{ng}$ is null. As it happens with MI, $c_{\mathrm{inf}}$ vanishes in the case of statistical independence. By applying the chain rule of KLD (Cover and Thomas 1991), the non-Gaussian term of MI, $I_{ng}$, can be decomposed as

$$I_{ng}(X, Y) = J(X, Y) - J(X) - J(Y), \tag{20}$$

where $J(\,)$ is negentropy, a positive quantity defined as the KLD between the true PDF and the Gaussian PDF with the same first- and second-order statistics. In (20), negentropy $J(X, Y)$ is invariant under a two-dimensional linear homeomorphism of $(X, Y)$. Therefore, without loss of generality, it is equal to the negentropy between the uncorrelated variables $X$ and the prediction residue $Y_r$ [(12)].

If $X$ and $Y$ are previously subjected to Gaussian anamorphosis, the marginal negentropies $J(X)$ and $J(Y)$ vanish.
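The Gaussian part of the MI can be illustrated with the textbook result that a bivariate Gaussian pair with correlation $c$ carries $-\tfrac{1}{2}\ln(1 - c^2)$ nats of mutual information. The sketch below uses that relation and its inverse; treating the inverse as the "information correlation" is an assumption of this example (it is merely one form that vanishes under independence and reduces to $|c|$ for a Gaussian pair), and the negentropy values are purely illustrative numbers.

```python
import math


def gaussian_mi(c):
    """MI (in nats) of a bivariate Gaussian pair with correlation c, |c| < 1."""
    return -0.5 * math.log(1.0 - c * c)


def correlation_from_mi(mi):
    """Inverse of gaussian_mi: |c| of the Gaussian pair whose MI equals mi."""
    return math.sqrt(1.0 - math.exp(-2.0 * mi))


def non_gaussian_mi(joint_negentropy, negentropy_x, negentropy_y):
    """Non-Gaussian MI from negentropies: I_ng = J(X, Y) - J(X) - J(Y)."""
    return joint_negentropy - negentropy_x - negentropy_y


c_g = 0.6                                # Gaussian correlation of the pair
i_g = gaussian_mi(c_g)                   # Gaussian MI
i_ng = non_gaussian_mi(0.05, 0.0, 0.0)   # illustrative values; marginals anamorphosed
c_inf = correlation_from_mi(i_g + i_ng)  # larger than |c_g| whenever i_ng > 0
```

With anamorphosed variables the marginal negentropies are zero, so any positive joint negentropy translates directly into a non-Gaussian MI and into a correlation-like measure above $|c_g|$.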

### d. Numerical estimation of MI

The numerical estimation of MI is rather difficult and has no unbiased estimators (Paninski 2003, 2004). It can be dealt with through numerous approaches, such as the plug-in, bin-adaptive networks (Kraskov et al. 2004), and ME (Abramov 2006). Here, we will estimate MI through two independent methods: ME, and another one based on the Edgeworth PDF expansion (EDG-PDF; Edgeworth 1905). The Edgeworth method is only reasonable in a weak non-Gaussianity scenario, contrary to ME, which, on the other hand, is computationally more costly. A summarized account of both methods is given as follows.

#### 1) Edgeworth expansion method

This method approximates the joint PDF of $(X, Y)$, or of any transformed pair such as $(g_X, g_Y)$, and then its MI. The fact of taking rotated standard variables $U = X$ and $W = Y_r$ [(12)] from $(X, Y)$ will considerably simplify the proposed PDF expansion. The joint PDF $\rho_{U,W}(u, w)$ can be approximated by

$$\rho_{U,W}(u, w) = \varphi(u)\varphi(w)\left[1 + \upsilon_{U,W}(u, w)\right] + O\left(n_{eq}^{-l}\right), \tag{21}$$

where $\varphi(\,)$ is the standard Gaussian PDF and $\upsilon_{U,W}(u, w)$ is a truncated fitting polynomial vanishing if the joint PDF is Gaussian. As far as the truncation error is concerned, the variables $U$ and $W$ are assumed to be arithmetic averages of an equivalent number $n_{eq}$ of independent and identically distributed (iid) variables, and $l$ is positive and increases with the truncation order (Comon 1994). The greater $n_{eq}$, the smaller the truncation error of the expression and the closer to Gaussianity the joint $(U, W)$ PDF, due to the central limit theorem. The function $\upsilon_{U,W}$ is expanded in terms of Hermite orthogonal polynomials [(A2), (A3)] and cumulants $k^{(p,q)}$ [(A4)] of order $p$ in $U$ and of order $q$ in $W$, which are appropriate joint polynomial expectancies of $U$ and $W$. The cumulants $k^{(p,q)}$ are scaled as $O[n_{eq}^{-(p+q)/2+1}]$. The function $\upsilon_{U,W}$ for $l = 3/2$ is given by (A1) in the appendix. Nonzero cumulants of order $p + q$, higher than or equal to three, reveal non-Gaussianity. In particular, if $X$ is rendered Gaussian the self $U$ cumulants $k^{(p,0)}$, $p \geq 3$ vanish. The EDG-PDF expansion converges in $L^2$, needing a large truncation $l$ in cases of high non-Gaussianity. It has the drawback that errors in the tail regions of the distribution may be comparable to the PDF itself, and the estimated density may even take negative values. To verify the positivity and normalization of the EDG-PDF, we numerically compute the integrals $P_{pos}$ and $P_{neg}$ of the EDG-PDF, respectively, in the domain of positive and negative values of the estimated truncated density [(21)]. The integrals are estimated by bivariate Gaussian quadrature, mapping the open intervals ]−∞, ∞[ into ]−1, 1[ through the transformation $x \to f(x) = x/(1 + |x|)$. The use of 50 weighting quadrature points has been sufficient for the convergence of integrals with an accuracy of ∼$10^{-4}$. A satisfactory condition for (21) to be a density is $|P_{neg}| \ll P_{pos} \sim 1$, which holds if cumulants in (A1) are sufficiently small. A nongeneral rule for approaching the referred condition is to perform Gaussian anamorphosis of the original variables. A reduced Edgeworth truncation is still valid considering a “tilted” variable whose modal region is nearer to the neighborhood where we wish to approximate, using the saddlepoint approximation technique (Daniels 1954). However, this approach will not be followed herein. An independent test of validity of the EDG-PDF is obtained after comparing it with the ME-obtained density ME-PDF.

Given $\rho_{U,W}(u, w)$, the PDF $\rho_{X,Y}(x, y)$ of the unrotated variables can be retrieved through the following expression:

$$\rho_{X,Y}(x, y) = \frac{1}{\sqrt{1 - c^2}}\,\rho_{U,W}\left(x, \frac{y - cx}{\sqrt{1 - c^2}}\right). \tag{22}$$

To evaluate the non-Gaussian MI of $(X, Y)$ one must obtain the joint negentropy $J(X, Y_r) = J(U, W)$, which is simply the KLD distance between $\rho_{U,W}(u, w)$ and the product $\varphi_U(u)\varphi_W(w)$, reducing to

$$J(U, W) = \iint \varphi(u)\varphi(w)\left[1 + \upsilon_{U,W}(u, w)\right]\ln\left[1 + \upsilon_{U,W}(u, w)\right]du\,dw. \tag{23}$$

The marginal negentropies $J(X)$ and $J(Y)$ are estimated through the equivalent equation to (23) for single variables. Two ways of computing (23) are followed. First, it is numerically computed in the domain $(u, w)$: $1 + \upsilon_{U,W}(u, w) \geq 0$, in the same way as $P_{neg}$ and $P_{pos}$. The non-Gaussian MI obtained through the estimated integral is denoted as $I_{ng(EI)}$. Then, we consider the Taylor expansion

$$(1 + \upsilon)\ln(1 + \upsilon) = \upsilon + \frac{\upsilon^2}{2} + O(\upsilon^3), \tag{24}$$

where $O(\upsilon^3)$ is $O(n_{eq}^{-3/2})$ because $\upsilon \sim n_{eq}^{-1/2}$. By noting that the function $\upsilon_{U,W}(u, w)$ is orthogonal to the product of Gaussian PDFs $\varphi(u)\varphi(w)$, the negentropy becomes the following positive quantity:

$$J(U, W) = \frac{1}{2}\iint \varphi(u)\varphi(w)\,\upsilon_{U,W}^2(u, w)\,du\,dw + O\left(n_{eq}^{-3/2}\right). \tag{25}$$

For $l = 3/2$, we have […] [(26)] […] $U$, $W$ variables. Equation (26) is a sum of the quadratic positive contributions from cumulants of order higher than two, while also resulting from a truncated expansion of the logarithm function. Furthermore, these are a simplification of what had been obtained by Comon (1994), where a truncation of up to $O(n_{eq}^{-2})$ had been considered. While simplifying the order of truncation, a generalization has been performed so as to obtain the Edgeworth expansion of the joint negentropy, whereas Comon (1994) had studied the one-dimensional case. The non-Gaussian MI obtained through (26) is denoted by $I_{ng(EF)}$. By having an explicit estimation of the PDF and taking into account the integral properties of Hermite polynomials, we can derive an analytic expression for the conditional expectancy of $Y$ given $X$, as

$$E(Y \mid X = x) = cx + \sqrt{1 - c^2}\,E(W \mid U = x), \tag{27}$$

where the first rhs term is the linear prediction of $Y$ from $X$, also represented by $Y$(lin), whereas the second rhs term is the additive correction resulting from non-Gaussianity, thus yielding the full nonlinear prediction $Y$(nolin). Given the Edgeworth expansions of both the joint and the marginal probability distributions, the conditional expectancy of the variable $W$ given the variable $U$ can be approximated by […] [(28)], where $H_i(\,)$ are single Hermite polynomials with the standard Gaussian kernel [(A2), (A3)]. The correctional polynomial $\gamma_U(x)$ at truncation $l = 3/2$ of the $X$ marginal EDG-PDF is given by […].

The cumulants $k^{(3,0)}$ and $k^{(4,0)}$ are, respectively, the skewness and the kurtosis (relative to that of the normal distribution) of the probability distribution of $X = U$ [see (A4)]. Equations (27) and (28) provide an easy nonlinear downscaling relationship of $Y$ from $X$.
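The positivity and normalization check described in this subsection can be sketched in one dimension. The code below uses the usual textbook Edgeworth form for a standardized variable with skewness `k3` and excess kurtosis `k4`, which is assumed here and is not necessarily identical to the paper's marginal expression. It also replaces the Gaussian quadrature of the text with a simple midpoint rule on the mapped interval ]−1, 1[; the function names are hypothetical.

```python
import math


def phi(x):
    """Standard Gaussian PDF."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def he3(x):
    return x ** 3 - 3.0 * x


def he4(x):
    return x ** 4 - 6.0 * x ** 2 + 3.0


def he6(x):
    return x ** 6 - 15.0 * x ** 4 + 45.0 * x ** 2 - 15.0


def edgeworth_pdf(x, k3, k4):
    """One-dimensional Edgeworth density: phi(x) * [1 + correction polynomial]."""
    corr = k3 / 6.0 * he3(x) + k4 / 24.0 * he4(x) + k3 ** 2 / 72.0 * he6(x)
    return phi(x) * (1.0 + corr)


def pos_neg_mass(k3, k4, n=20000):
    """Integrals of the Edgeworth density over its positive and negative parts.

    The real line is mapped to ]-1, 1[ by t = x / (1 + |x|), so that
    x = t / (1 - |t|) and dx = dt / (1 - |t|)**2; a midpoint rule is used in t.
    """
    h = 2.0 / n
    p_pos = p_neg = 0.0
    for i in range(n):
        t = -1.0 + (i + 0.5) * h
        s = 1.0 - abs(t)
        f = edgeworth_pdf(t / s, k3, k4) / (s * s) * h
        if f >= 0.0:
            p_pos += f
        else:
            p_neg += f
    return p_pos, p_neg


p_pos, p_neg = pos_neg_mass(0.3, 0.2)                 # weakly non-Gaussian case
p_pos_strong, p_neg_strong = pos_neg_mass(1.5, 0.0)   # strongly skewed case
```

Since the Hermite terms integrate to zero against the Gaussian kernel, `p_pos + p_neg` stays close to one in both cases; what changes with larger cumulants is the size of the negative part, which is the quantity that signals when the truncated expansion stops behaving like a density.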

#### 2) Maximum entropy method

The negentropies entering $I_{ng}(X, Y)$ [(20)] are estimated using the maximum entropies $H_M(U = X, W = Y_r)$, $H_M(X)$, $H_M(Y)$, constrained under the set of known cumulants or, equivalently, the involved expectancies, following the maximum entropy principle of Shannon (1948) and Jaynes (1982). Let us consider the constraints on $N_c$ expectancies, up to fourth joint order, […] [(30)]. The maximum entropy $H_M(U, W)$ of the ME-PDF $\rho_{M(U,W)}(u, w)$, satisfying (30), is the solution of the unconstrained minimization problem […] $\lambda_1, \ldots, \lambda_{N_c}$. The function Γ is given by […]. The support set of the ME-PDF is the $(U, W)$ domain $D$.

In practice, we proceed by taking the finite domain $D = [-L, L] \otimes [-L, L]$ in $(U, W)$ and then increasing $L$ in order to asymptotically reach the maximum entropy $H_M(U, W)$ for $L = \infty$. By using the Leibniz derivation rule, it is easy to obtain the derivative of $H_M(U, W)$ with respect to $L$, […] [(33)] […] $\delta D$ of $D$. The bounding of (33) when $|u|, |w| \to \infty$ leads to the scaling of the logarithm of the ME-PDF as $O(-L^m)$, with $m$ taking the largest exponents $p_i$, $q_i$ in (30). Therefore, given the range of moments [(30)], we can extend the limits of $D$ sufficiently far so as to get negligible boundary effects on the entropy and the ME-PDF. Furthermore, in order to get integrands of the order $\exp[O(1)]$ during the optimization process, we solve the ME problem for the scaled variables $(U/L, W/L)$ in the square $S = [-1, 1] \otimes [-1, 1]$, taking the appropriate scaled constraints. Afterward, we apply the scaling entropy relationship

$$H_M(U, W) = H_M(U/L, W/L) + 2\ln L.$$

The minimization problem is solved by the quasi-Newton method starting at $\lambda_i = 0$ for all $i$. The integrals giving the function Γ and its $\lambda$ derivatives are approximated by the bivariate Gauss quadrature rule with $N_p$ weighting factors in the interval [−1, 1]. To get full resolution during the minimization, and to avoid not-a-number (NaN) and infinite (Inf) values in the computation, we subtract from the polynomials in the arguments of the exponentials their corresponding maximum in $D$. Finally, the function Γ is multiplied by a sufficiently high factor $F$ in order to emphasize the gradient.

Considering the range of constraint moments in our cases, several experiments have led to the reasonable values of $L = 10$, $N_p = 50$, $F = 1000$, for which convergence is obtained after ∼60 optimization iterates for an accuracy of $10^{-6}$ of the gradient of Γ. Larger values of $L$ require larger values of $N_p$, thus decreasing the convergence rate.
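The ingredients of the procedure (finite support, scaled variables, subtraction of the maximum exponent, minimization over the multipliers, entropy rescaling) can be sketched in one dimension. This is a deliberately simplified illustration under stated assumptions: it is univariate rather than bivariate, it uses a damped Newton iteration instead of the quasi-Newton method of the text, a midpoint rule instead of Gauss quadrature, and the usual dual function $\Gamma(\lambda) = \ln\int\exp(\sum_i \lambda_i s^i)\,ds - \sum_i \lambda_i \mu_i$, which is assumed here. All names are hypothetical.

```python
import math


def solve(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def max_entropy_1d(moments, L=6.0, n=2000, tol=1e-9, max_iter=200):
    """ME density on [-L, L] with E[x^p] = moments[p-1]; returns (lambdas, entropy)."""
    m = len(moments)
    mu = [moments[p] / L ** (p + 1) for p in range(m)]   # constraints for s = x / L
    h = 2.0 / n
    grid = [-1.0 + (i + 0.5) * h for i in range(n)]
    feats = [[s ** (p + 1) for p in range(m)] for s in grid]

    def gamma_and_weights(lam):
        expo = [sum(l * f for l, f in zip(lam, fi)) for fi in feats]
        emax = max(expo)                       # subtract the maximum to avoid overflow
        w = [math.exp(e - emax) for e in expo]
        z = sum(w) * h
        gam = emax + math.log(z) - sum(l * u for l, u in zip(lam, mu))
        return gam, [wi * h / z for wi in w]

    lam = [0.0] * m
    gam, w = gamma_and_weights(lam)
    for _ in range(max_iter):
        mean = [sum(wi * fi[p] for wi, fi in zip(w, feats)) for p in range(m)]
        grad = [mean[p] - mu[p] for p in range(m)]
        if max(abs(g) for g in grad) < tol:
            break
        hess = [[sum(wi * (fi[p] - mean[p]) * (fi[q] - mean[q])
                     for wi, fi in zip(w, feats)) for q in range(m)] for p in range(m)]
        step = solve(hess, [-g for g in grad])
        t = 1.0
        while True:                            # backtracking keeps Gamma from increasing
            trial = [l + t * d for l, d in zip(lam, step)]
            g_trial, w_trial = gamma_and_weights(trial)
            if g_trial <= gam + 1e-12 or t < 1e-8:
                break
            t *= 0.5
        lam, gam, w = trial, g_trial, w_trial
    # At the minimum Gamma equals the entropy of s = x / L; add ln L for x itself.
    return [l / L ** (p + 1) for p, l in enumerate(lam)], gam + math.log(L)


# With only the mean and the variance constrained, the ME density is Gaussian.
lam, h_max = max_entropy_1d([0.0, 1.0])
```

With zero mean and unit second moment the recovered multipliers should be close to $(0, -1/2)$ and the entropy close to $\tfrac{1}{2}\ln(2\pi e)$, the Gaussian value; adding third- and fourth-order moments to `moments` would lower the maximum entropy, and the difference is a negentropy estimate.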

## 3. Dataset and processing

### a. Data

We take $X$ to be the standardized (zero average and unit variance) NAO index given by the first principal component of the detrended SLP monthly data over the NAE in DJF, over the above-mentioned period. This PC-based NAO index ($X$) is related to more traditional indexes based on SLP differences (Osborn et al. 1999). The PDF of the PC-based NAO index is slightly platykurtic [kur($X$) $= E(X^4) - 3 = -0.7$, where kur indicates the kurtosis] and bimodal with the presence of two regimes: NAO+ and NAO−. The grid-point-standardized monthly detrended precipitation is assigned to the variable $Y$. The distribution of monthly DJF precipitation over oceanic areas is closer to the Gaussian than the one over land. It is positively skewed and some locations exhibit outliers with large positive anomalies, with high kurtosis values, especially over Greenland [kur($Y$) ≈ 5], the Canadian Arctic [kur($Y$) ≈ 5], and some desert areas over North Africa [kur($Y$) ≥ 10]. Those extreme precipitation values inflate joint $(X, Y)$ cumulants, thus rendering inapplicable the Edgeworth formalism of estimating mutual information and non-Gaussianity. To obtain, in general, smaller cumulants, we apply the Gaussian anamorphosis both to $X$ and $Y$. For that purpose, in practice, we start by sorting data within $X$ in ascending order. The $k$th value of the Gaussian variable $X_g$ of $X$ in ascending order is given by […] [(36)], where $N = 53 \times 3$ is the total number of months in the sample, and $\Phi^{-1}(\,)$ is defined as in (2). The same procedure applies to $Y$. The transformation [(36)] assumes that data uniformly cover the true probability distribution, and consequently it suffers in practice from sampling errors.

To illustrate the relevance of both non-Gaussianity and asymmetry of the precipitation response to NAO, we choose the following six grid points: 1) central Atlantic (ATL; 37.5°N, 25°W); 2) northwest Scotland (SCO; 60°N, 12.5°W); 3) Balearic Islands (BAL; 37.5°N, 2.5°E); 4) Greenland (GRE; 62.5°N, 45°W); 5) east United States (EUS; 40°N, 60°W); and 6) Russia (RUS; 62.5°N, 22.5°E).

### b. Statistical tests

[…] $N_{df}$ temporal degrees of freedom. The estimated $N_{df}$ of the pair (monthly NAO index $X$, monthly precipitation $Y$) uses the 1- and 2-month-lag $X$, $Y$ autocorrelations over the DJF period (Livezey and Chen 1983), […].

The $N_{df}$ is fairly uniform over NAE and close to its spatial average $N_{df} \sim 0.95N = 151$, both for Gaussianized and original variables. In the MCR version we consider 300 different proxy NAO series by randomly permuting the 53 analyzed years while keeping the DJF monthly sequence. Then the statistical tests of the randomized NAO and precipitation are computed in each of 833 NAE grid points and then collected altogether in an ensemble of 833 × 300 realizations. Both for MCG and MCR, the values of the statistical tests are sorted so as to compute quantiles giving the 90%, 95%, and 99% significance level intervals (summarized in Table 1) of rejection of the null hypothesis $H_0$ of $(X, Y)$ independence. Rejection of $H_0$ is easier for test-side correlations than $c$ because they deal with half of the data, below or above the median of $X$. The thresholds of $H_0$ rejection for $I_{ng(ME)}$ are slightly larger than those for $I_{ng(EF)}$ (Table 1). The confidence regions of the information correlation are obtained from $I_{ng(ME)}$ and $I_g$. Thresholds for $k^{(2,1)}$ and $k^{(3,1)}$ are also computed. The MCR tests are more appropriate than the MCG tests because they preserve the marginal distribution of working variables and their spatial correlations. This leads to less conservative criteria of $H_0$ rejection in MCR, especially for the non-Gaussianity measures $I_{ng(ME)}$, $I_{ng(EF)}$, and $J_c$. In all statistical maps we shade the 90% MCR statistically significant regions. The fraction of statistically significant area (FS) must be larger than that occurring by mere chance (10%). The values of FS for MCR and MCG are denoted, respectively, as FS-MCR and FS-MCG.
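The randomization (MCR) idea of permuting whole years while keeping the within-season monthly order can be sketched as follows. The example is only a single-point illustration with synthetic series: the use of the absolute Pearson correlation as the test statistic, the function names, and the simple empirical quantile rule are assumptions of this sketch.

```python
import math
import random


def pearson(x, y):
    """Pearson (linear) correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def year_permutation_null(x, y, n_years, per_year=3, n_perm=300, seed=0):
    """Sorted null sample of |c(X, Y)| obtained by shuffling whole years of X."""
    rng = random.Random(seed)
    blocks = [x[i * per_year:(i + 1) * per_year] for i in range(n_years)]
    null = []
    for _ in range(n_perm):
        order = list(range(n_years))
        rng.shuffle(order)
        # Years are reordered, but months inside each year keep their sequence.
        x_perm = [v for j in order for v in blocks[j]]
        null.append(abs(pearson(x_perm, y)))
    return sorted(null)


def threshold(sorted_null, level):
    """Empirical quantile of the sorted null sample at the given level (e.g. 0.95)."""
    idx = min(len(sorted_null) - 1, int(level * len(sorted_null)))
    return sorted_null[idx]


# Synthetic single-point example: 53 winters of 3 months each.
rng = random.Random(2)
n_years = 53
x = [rng.gauss(0.0, 1.0) for _ in range(3 * n_years)]
y = [0.8 * a + 0.6 * rng.gauss(0.0, 1.0) for a in x]

null = year_permutation_null(x, y, n_years)
thr90, thr95, thr99 = (threshold(null, q) for q in (0.90, 0.95, 0.99))
c_obs = abs(pearson(x, y))
reject_h0 = c_obs > thr95
```

Permuting years rather than individual months leaves the marginal distribution of the index and its within-season structure untouched, so the thresholds reflect only the loss of the $(X, Y)$ pairing.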

## 4. Results

### a. Gaussian and asymmetric correlations

The maps over the NAE of the global correlation $c$, the Gaussian correlation $c_g$, and the rank correlation (not shown) between $X$ and $Y$ are rather similar. They differ by no more than approximately ±10% in certain regions. The map of $c_g$ shown in Fig. 1a is mainly related to the northward (southward) shift of synoptic storm tracks in the NAO+ (NAO−), thus producing positive (negative) precipitation anomalies in the range 50°N–70°N, east of 40°W, and negative (positive) precipitation anomalies in the range 35°N–45°N, east of 60°W (Hurrell 1995). Other negative correlation regions are the southeastern part of Greenland, the Canadian Arctic, and Labrador, Canada. There are some differences between the map of $c_g$ and those of test-side correlations of Gaussian data, thus revealing asymmetry. This means that the response of precipitation to NAO is asymmetric and thus non-Gaussian. The main contribution to $c_g$ (Fig. 1a) comes from the test central correlation $t_M(X_g, Y_g)$ (Fig. 1b), which results from the alignment of NAO+ and NAO− centroids. Consequently, these maps are rather similar.

The differences between $t_-(X_g, Y_g)$ and $t_+(X_g, Y_g)$ reveal differences in sensitivities to the NAO+ and NAO− regimes. This is highlighted in Figs. 2a–f, which show for the six target locations 1) the distribution of the $(X_g, Y_g)$ data, 2) the contours of the joint PDF obtained with the expansion [(A1)], 3) the linear and nonlinear prediction of $Y_g$ [(27)], and 4) the smoothed graphics of the $X_g$ conditional mean square error (MSE) of the linear $Y_g$(lin) and nonlinear $Y_g$(nolin) [(27)] predictions, obtained in full cross-validation mode over the $N = 159$ data.

For all six cases, the largest data concentration is seen near the origin ($X_g = 0$, $Y_g = 0$), where PDF contours are elliptic. The farthest contours are deformed because of extreme values. In the distribution, the negative and positive sides (NAO− and NAO+) can be quite different. Near the central Atlantic, where a large area of strong negative correlations is seen, the sensitivity of precipitation to the NAO index is negatively stronger in the NAO− (wetter) regime than in the NAO+ (drier) regime. This is in accordance with the values of $t_+ = -0.30$ and $t_- = -0.64$ for the ATL point (see Table 2). Therefore, as seen in Fig. 2a, those data and the PDF spread over a larger domain in the NAO+ regime than in the NAO− regime.

The improvement due to the nonlinear predictions is visible from the reduction of the cross-validated MSE of the NAO-downscaled precipitation, especially for the negative precipitation anomalous values (top of Fig. 2a and Table 2).

The larger sensitivity of precipitation in the wetter NAO regime is also verified north of Scotland, near the location of most positive correlation extremes, in accordance with $t_-(X_g, Y_g) = 0.43$ and $t_+(X_g, Y_g) = 0.82$ for the SCO point (see Fig. 2e and Table 2). This behavior occurs because near the average storm tracking of NAO− and NAO+ regimes, the sensitivity of the precipitation response must be enhanced. The higher the sensitivity, the closer the phenomenon, that is, a local source produces higher sensitivity than remote sources.

There are regions where the precipitation sensitivity is nearly restricted to one of the regimes, that is, where only one of the test-side correlations ($t_-$ or $t_+$) is significantly different from zero. In particular, the negative test-side correlation is strong over the Ukraine, Romania, and former Yugoslavia ($t_- \approx -0.6$), whereas the corresponding positive one is quite small ($t_+ \approx -0.2$). In the Mediterranean region, south of 40°N, the positive test-side correlation $t_+$ is negative, whereas the corresponding one on the negative side practically vanishes. This means that, in the south Mediterranean, while in the NAO− regime the statistical mean response of precipitation to NAO is not significant, in the NAO+ regime it is favorable to strong extreme drought events. This is consistent with the values of $t_- = 0.19$ and $t_+ = -0.65$ for the BAL point. This is also apparent from the shape of contours (Fig. 2b) and from the nonlinear prediction graphic. This situation occurs especially in the second half of the analyzed time interval (i.e., 1978–2003), and may have a connection with the increasing desertification over Mediterranean regions during that same period. These results agree with the positive NAO trend over the last two decades (Hurrell et al. 2004), with higher positive extremes in the corresponding index.

In south Greenland and Baffin Bay the driest conditions are again especially favored in the NAO+ regime, whereas the NAO index in the negative regime practically does not have any average statistical influence on precipitation. Consistent values of $t_- = 0.14$ and $t_+ = -0.44$ are given at the GRE point. This is visible from the completely different shape of PDF contours in the positive and negative regimes (Fig. 2d). This may be due to a nonlinear influence of NAO on the systems influencing the precipitation in Greenland, which must be synoptically analyzed in further studies. The precipitation in Greenland is also correlated with the presence of other regimes, such as the Greenland–Scandinavian regime, influencing the strength of the Icelandic low (Serreze et al. 1997). We have studied another particular situation where $t_-$ and $t_+$ have the same sign, opposite to that of $t_M$. This holds at the EUS point (Fig. 2c) with $t_- = 0.24$, $t_+ = 0.35$, and $t_M = -0.28$. Unlike other points of strong correlation, the $(X_g, Y_g)$ distribution in the RUS point is rather close to bi-Gaussianity, as is clear from Fig. 2f and Table 2. At this point $c_g = 0.69$.

The asymmetry measure $J_c(X_g, Y_g)$ for Gaussian data is presented in Fig. 3a. The largest values are significant at ∼95% significance level (see Table 1). Some coherent regions of significant $J_c$ are visible in the map over the Mediterranean and the central Atlantic, and near 40°N, South Greenland, and the Canadian Arctic coast.

_{c}### b. Cumulants and Edgeworth PDF applicability

The contribution to non-Gaussianity comes from high-order cumulants, easily expressed in terms of nonlinear correlations, leading to nonzero cumulant terms in the Edgeworth expansion of the joint PDF [(A1)] and the negentropy [(26)]. We show maps of the main cumulant terms of the Gaussian variables, *k*^{(2,1)} and *k*^{(3,1)} (Figs. 3b–3c), which contribute the most to (A1) and (26). They are proportional to the nonlinear correlations cor(*X*^{2}_{g}, *Y*_{r}) and cor(*X*^{3}_{g}, *Y*_{r}), respectively. The other cumulants, *k*^{(1,2)}, *k*^{(1,3)}, and *k*^{(2,2)}, are only residual (not shown).
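The two nonlinear correlations can be sketched as follows, assuming that *Y*_{r} is the residual of the least-squares linear prediction of *Y*_{g} from *X*_{g}. The proportionality constants that link these correlations to *k*^{(2,1)} and *k*^{(3,1)} are not reproduced, and the function names and synthetic data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linear_residual(x, y):
    """Residual of the least-squares linear prediction of y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]


def nonlinear_correlations(x, y):
    """Return cor(x^2, y_r) and cor(x^3, y_r), with y_r the linear residual."""
    y_r = linear_residual(x, y)
    cor2 = pearson([a ** 2 for a in x], y_r)
    cor3 = pearson([a ** 3 for a in x], y_r)
    return cor2, cor3


# Synthetic data (not from the paper): a negative linear slope plus a
# quadratic term of positive concavity and Gaussian noise.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
y = [-0.5 * a + 0.3 * (a * a - 1.0) + rng.gauss(0.0, 0.7) for a in x]
cor2, cor3 = nonlinear_correlations(x, y)
print(f"cor(x^2, y_r) = {cor2:+.2f}, cor(x^3, y_r) = {cor3:+.2f}")
```

With a purely quadratic departure from linearity, only the first correlation is expected to differ clearly from zero; a cubic term in the response would show up in the second.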

The first and second mentioned correlations express the correlation between the residues of the precipitation linear prediction and, respectively, the squares (*X*^{2}_{g}) and cubes (*X*^{3}_{g}) of the Gaussian NAO index *X*_{g}. These correlations express, respectively, how well a quadratic or a cubic function of NAO fits those residues. The cumulant *k*^{(2,1)} is dominant over Greenland, the Mediterranean, and the Atlantic Central Basin. The corresponding spatial dependence is closely related to the asymmetry test *J*_{c} map (Fig. 3a), with a map correlation of 0.93 over the NAE. This is because the nonlinear correlations cor(*X*^{2}_{g}, *Y*_{r}) and cor(|*X*_{g}|, *Y*_{r}), the latter related to *J*_{c} (see section 2b), behave in the same way.

The cumulants *k*^{(3,1)} and *k*^{(2,1)} also contribute to the nonlinear prediction [(28)]. Note, in particular for the central Atlantic, that the stronger correlation on the negative side (NAO−) is consistent with a fitted predictive curve formed from a straight line of negative slope [resulting from the negative correlation *c*(*X*_{g}, *Y*_{g})], a quadratic curve of positive concavity [resulting from the positive cor(*X*^{2}_{g}, *Y*_{r}), Fig. 3b], and a cubic curve dependent on *X*^{3}_{g} [resulting from the positive cor(*X*^{3}_{g}, *Y*_{r}), Fig. 3c]. To verify how the EDG-PDF differs from a probability density, we compute *P*_{neg} (Fig. 3d). The maxima of |*P*_{neg}| are reached in the central Atlantic region (∼0.014) and south of Iceland (∼0.008). These small values constitute a necessary, albeit not sufficient, condition for the EDG-PDF to be a good representation of the real PDF.

### c. Mutual information

The map of the Gaussian MI *I*_{g}(*X*_{g}, *Y*_{g}) (Fig. 4a) is obtained from that of *c*_{g}, with two cores of maxima reaching 0.4 nats near 60°N, 10°W and 0.30 nats near 35°N, 15°W, where the nat is the MI unit when natural logarithms are used. The non-Gaussian MI is computed using the following three proposed estimators: the maximum entropy estimator *I*_{ng(ME)} (Fig. 4b), the Edgeworth estimator of (26), *I*_{ng(EF)} (Fig. 4c), and that obtained with the integral (23), *I*_{ng(EI)} (practically identical to that of Fig. 4c). There are some regions with *I*_{ng(ME)}(*X*_{g}, *Y*_{g}) above the corresponding 95% significance level (0.039 nats). These regions are the central and west Atlantic (*I*_{ng} ≈ 0.06 nats), southeast Iceland, and south Greenland, where *I*_{ng} reaches maxima of ∼0.06 nats. The value of *I*_{ng} over the Mediterranean, approximately 0.02–0.04 nats, appears to be significant at the 80% level. There are also some regions of non-Gaussianity in central Europe (Fig. 4b) and around 42°N, 48°W, with *I*_{ng} ≈ 0.04 nats. As expected, *I*_{ng} and the asymmetry measure *J*_{c} share some common regions, because the asymmetry contributes to non-Gaussianity.

Contrary to the EDG estimator, the ME estimator has no fundamental limitations as far as the size of cumulants is concerned. To assess the effect of the Taylor approximation of the logarithm of the EDG-PDF [(24)], we have sorted, in ascending order, all the values of the integral estimator *I*_{ng(EI)} in the 833 grid points of the NAE and plotted them against the corresponding values of *I*_{ng(EF)} (Fig. 5a). Both estimators agree within the error of the Gaussian quadrature (∼10^{−4} nats) up to *I*_{ng} ≈ 0.02 nats, followed by a slight overestimation by the Edgeworth equation [(26)] of MI [*I*_{ng(EF)}]. The map correlation between the two estimators is 0.98. By comparing the sorted values of *I*_{ng(ME)} with the corresponding *I*_{ng(EI)} values (Fig. 5b), we get an idea of the effect of the Edgeworth truncation error in (23). Their correlation is 0.76, while the correlation with *I*_{ng(EF)} is, as expected, lower (0.69). Over all 833 values, the *I*_{ng(EI)} estimator exhibits an increasingly negative bias in comparison with the maximum entropy estimator, especially for *I*_{ng(ME)} > 0.01 nats. This explains the absence of some “non-Gaussian” regions in central Europe from the map of *I*_{ng(EF)} (Fig. 4c). The central Atlantic, south Iceland, and Greenland non-Gaussian regions are retrieved by *I*_{ng(EF)}, where excessively “spiky” values occur and *P*_{neg} reaches its maximum values. Overall, *I*_{ng(ME)} is a smoother field than *I*_{ng(EF)}.

As far as non-Gaussianity maps are concerned, the fraction of statistically significant area (FS) is slightly larger than that obtained by mere chance (10%), contrary to FS for the correlation maps. This is a consequence of the intrinsically low non-Gaussianity values resulting from temporal averaging.
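The Gaussian MI map follows from the correlation map through the standard identity for a bivariate Gaussian, *I*_{g} = −½ ln(1 − *c*^{2}), in nats. A minimal sketch of this identity and its inverse is given here; the function names are hypothetical.

```python
import math


def gaussian_mi(c):
    """MI in nats of a bivariate Gaussian with correlation c, |c| < 1."""
    if not -1.0 < c < 1.0:
        raise ValueError("correlation must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - c * c)


def correlation_from_mi(i_g):
    """Absolute correlation |c| that yields a Gaussian MI of i_g nats."""
    if i_g < 0.0:
        raise ValueError("mutual information cannot be negative")
    return math.sqrt(1.0 - math.exp(-2.0 * i_g))


for i_g in (0.30, 0.40):
    print(f"I_g = {i_g:.2f} nats  <->  |c| = {correlation_from_mi(i_g):.2f}")
```

Under this identity, the two maxima of 0.4 and 0.30 nats quoted above correspond to absolute correlations of roughly 0.74 and 0.67.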

### d. The effect of Gaussian anamorphosis

Near the centroids of maximum Gaussian correlation, 40°N, 20°W and 60°N, 10°W, we have verified that the side correlation is more intense in the wet NAO phase than in the dry NAO phase. If no Gaussian anamorphosis (GA) is performed, there is an enhancement of the side correlation over the positively skewed half of the precipitation PDF in comparison with the marginally Gaussian variables. This can be seen through the larger intensity of the asymmetry measure *J*_{c} (Fig. 6a) as compared with the Gaussian case (Fig. 3a). The GA is a nonlinear transformation that can either increase or decrease the absolute correlation |*c*| between variables, and thus the Gaussian MI. For example, GA, when performed with appropriate mixtures of two bivariate Gaussian PDFs (one with zero correlation and the other with correlation ∼1), can convert a nearly zero correlation into a nearly 100% correlation and vice versa. In our case, the change of correlation resulting from GA is not very high (maximum of ∼10%). However, given that the derivative of the Gaussian MI (*I*_{g}) with respect to |*c*| grows from zero to infinity as |*c*| increases from zero to one, a small change of |*c*| can make a large difference in *I*_{g}. The *I*_{g} difference between original and Gaussian data is plotted in Fig. 6b, with a positive maximum of ∼0.05 nats near 40°N, 20°W, where |*c*| grows from ∼0.64 to ∼0.69. Compared with the case of marginally Gaussian data, the *I*_{ng} over the Mediterranean is preserved in the original data (Fig. 6c), and another large area appears around the White Sea and Kola Peninsula at approximately 60°N, 35°E.
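How GA can change the correlation, and with it the Gaussian MI, can be seen in a minimal sketch. It assumes a rank-based anamorphosis (each value is replaced by the standard normal quantile of its rank), which may differ in detail from the transformation used for the maps; the function names and the synthetic lognormal "precipitation" are hypothetical.

```python
import math
import random
from statistics import NormalDist


def gaussian_anamorphosis(values):
    """Replace each value by the standard normal quantile of its rank.

    Ties are broken by position; no special tie handling is attempted.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order):
        scores[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return scores


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi(c):
    """MI in nats of a bivariate Gaussian with correlation c."""
    return -0.5 * math.log(1.0 - c * c)


# Synthetic data (not from the paper): a positively skewed, lognormal
# response to a Gaussian index.
rng = random.Random(2)
x = [rng.gauss(0.0, 1.0) for _ in range(3000)]
y = [math.exp(0.8 * a + 0.6 * rng.gauss(0.0, 1.0)) for a in x]
c_raw = pearson(x, y)
c_gauss = pearson(gaussian_anamorphosis(x), gaussian_anamorphosis(y))
print(f"|c| raw = {abs(c_raw):.2f}, after GA = {abs(c_gauss):.2f}")
print(f"I_g raw = {gaussian_mi(c_raw):.3f}, after GA = {gaussian_mi(c_gauss):.3f} nats")
# Sensitivity for the correlations quoted in the text (0.64 and 0.69).
delta = gaussian_mi(0.69) - gaussian_mi(0.64)
print(f"I_g(0.69) - I_g(0.64) = {delta:.3f} nats")
```

In this synthetic case the population correlation rises from about 0.61 before GA to 0.80 after it. For the rounded correlations 0.64 and 0.69, the identity gives a difference of the same order as the ∼0.05 nats quoted above.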

The invariance of MI for the original and Gaussian data [(16)] holds for the ME estimator with an accuracy of ∼0.01 nats. Beyond the numerical accuracy, this is also because cumulants of order larger than 4, not taken into account in the ME estimator of MI, have a slightly different effect on the original and Gaussian data. The invariance [(16)] of MI is hardly verified for the estimator *I*_{ng(EF)}, except for low values of the joint negentropy. The original highly skewed precipitation values render the EDG-PDF inapplicable for much of the NAE because of the much higher values of |*P*_{neg}| (Fig. 6d) (above 0.01), as compared with those of “Gaussianized” variables (Fig. 3d).

## 5. Discussion and conclusions

An asymmetry measure of bivariate probability distributions is built. The method computes the conditional correlations for each half of the sorted data, denoted in the paper as side correlations. The comparison between side correlations and the global correlation, under the hypothesis of bivariate Gaussianity, led us to the construction of several robust tests of PDF asymmetry. Asymmetry contributes to non-Gaussianity, giving extra information beyond that given by a Gaussian PDF. Consequently, we aimed at evaluating MI and its Gaussian and non-Gaussian parts. Two estimators thereof are proposed. The first estimator is based on the Edgeworth expansion of the joint PDF and MI in terms of cumulants and Hermite polynomials, and is applicable only for sufficiently small values of the joint negentropy and cumulants. The second uses the statistical entropies estimated by the ME principle, and is thus applicable even for large deviations from Gaussianity. Non-Gaussian MI is computed for two sets of variables: the original ones and those obtained by applying Gaussian anamorphosis to the original variables, in order to mitigate the effect of marginal outliers on the cumulants and thus make the Edgeworth method more applicable.

The methods have been applied to the PC-based NAO index as a large-scale predictor variable and to the gridpoint monthly DJF precipitation over the NAE as downscaled predictand variables.

From numerical computations we have verified that the response of monthly precipitation to NAO is asymmetric and non-Gaussian. Maps of both test-side correlations for NAO− and NAO+ regimes show coherent regions and consistent regional differences, thus highlighting the asymmetric precipitation response to NAO. Within the main extreme correlation centers between NAO and precipitation, the side correlation is stronger in the wetter NAO regime and is enhanced if no Gaussian anamorphosis is performed over the precipitation field. In other regions such as the Mediterranean or Greenland, the sensitivity of precipitation in the NAO− regime practically vanishes, whereas the correlation in the drier regime (NAO+) is significantly negative.

The MI provides a general measure of statistical redundancy. As far as the untransformed variables (NAO, precipitation) are concerned, the bulk of the MI comes from its Gaussian part, with some exceptions over areas such as Greenland, where the non-Gaussian part of MI and nonlinearity are relevant. The non-Gaussian MI is relevant in some areas such as the Mediterranean, the southern part of Greenland, the southeast of Iceland, the area around 42°N, 48°W, and regions around the White Sea and Kola Peninsula. For Gaussian data, a maximum of non-Gaussian MI appears in the Central Atlantic Basin, resulting from a local decrease of the global correlation produced by the Gaussian anamorphosis.

Both the ME and EDG methods can be generalized to multivariate data providing estimates of the joint and conditional PDF and moments. Extensions using higher-order cumulants are also possible. However, in order to avoid overfitting, it is preferable that the PDF calibration and validation be made in cross-validation mode.

The maximum entropy method for computing MI holds without restrictions and may be a useful tool for analyzing non-Gaussianity of climatic data. The Edgeworth method is able to give an indication of non-Gaussianity if variables are previously constrained to significantly reduce the magnitude of cumulants.

## Acknowledgments

This research was developed at CGUL with support from the Portuguese Science Foundation under the project PREDATOR—POCTI/CTE-ATM/62475/2004, co-financed by the European Union under program FEDER. Thanks are due to an anonymous referee, Timothy DelSole, Dinis Pestana, Aapo Hyvärinen, Marc Hulle, and Ricardo Trigo, for their constructive comments and criticisms.

## REFERENCES

Abdallah, A. S., and M. D. Plumbey, 2003: Geometric ICA using nonlinear correlations and MDS. *Proc. Fourth Int. Symp. on Independent Component Analysis and Blind Signal Separation,* Nara, Japan, Brain Science Institute, 161–166.

Abramov, R., 2006: A practical computational framework for the multidimensional moment-constrained maximum entropy principle. *J. Comput. Phys.*, **211**, 198–209.

Abramowitz, M., and I. A. Stegun, Eds., 1972: *Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables*. Dover, 1046 pp.

Aires, W., B. Rossow, and A. Chedin, 2002: Rotation of EOFs by the independent component analysis: Towards a solution of the mixing problem in the decomposition of geophysical time series. *J. Atmos. Sci.*, **59**, 111–123.

Barndorff-Nielsen, O. E., and D. R. Cox, 1989: *Asymptotic Techniques for use in Statistics*. Chapman & Hall, 252 pp.

Biau, G., E. Zorita, H. von Storch, and H. Wackernagel, 1999: Estimation of precipitation by kriging in the EOF space of the sea level pressure field. *J. Climate*, **12**, 1070–1085.

Cassou, C., T. Laurent, J. W. Hurrell, and C. Deser, 2004: North Atlantic winter climate regimes: Spatial asymmetry, stationarity with time, and oceanic forcing. *J. Climate*, **17**, 1055–1068.

Cavazos, T., 2000: Using self-organizing maps to investigate extreme climate events: An application to wintertime precipitation in the Balkans. *J. Climate*, **13**, 1718–1732.

Comon, P., 1994: Independent component analysis, a new concept? *Signal Process.*, **36**, 287–314.

Cover, T. M., and J. A. Thomas, 1991: *Elements of Information Theory*. Wiley, 576 pp.

Daniels, H. E., 1954: Saddlepoint approximations in statistics. *Ann. Math. Statist.*, **25**, 631–650.

DelSole, T., 2004: Predictability and information theory. Part I: Measures of predictability. *J. Atmos. Sci.*, **61**, 2425–2440.

Dole, R. M., 1983: Persistent anomalies of the extratropical Northern Hemisphere wintertime circulation. *Large-Scale Dynamical Processes in the Atmosphere,* B. J. Hoskins and R. P. Pearce, Eds., Academic Press, 95–109.

Edgeworth, F. Y., 1905: The law of error. *Cambridge Philos. Soc.*, **20**, 36–66, 113–141.

Hurrell, J. W., 1995: Decadal trends in the North Atlantic Oscillation: Regional temperatures and precipitation. *Science*, **269**, 377–387.

Hurrell, J. W., M. P. Hoerling, A. S. Phillips, and T. Xu, 2004: Twentieth century North Atlantic climate change. Part I: Assessing determinism. *Climate Dyn.*, **23**, 371–389.

Jaynes, E. T., 1982: On the rationale of maximum-entropy methods. *Proc. IEEE*, **70**, 939–952.

Jolliffe, I. T., and D. B. Stephenson, 2003: *Forecast Verification: A Practitioner’s Guide in Atmospheric Science*. Wiley, 254 pp.

Kenney, J. F., and E. S. Keeping, 1951: *Mathematics of Statistics, Part 2*. 2d ed. Van Nostrand, 202 pp.

Kistler, R., and Coauthors, 2001: The NCEP–NCAR 50-Year Reanalysis: Monthly means CD-ROM and documentation. *Bull. Amer. Meteor. Soc.*, **82**, 247–268.

Kraskov, A., H. Stögbauer, and P. Grassberger, 2004: Estimating mutual information. *Phys. Rev. E*, **69**, 066138, doi:10.1103/PhysRevE.69.066138.

Livezey, R. E., and W. Y. Chen, 1983: Statistical field significance and its determination by Monte-Carlo techniques. *Mon. Wea. Rev.*, **111**, 46–59.

Marwan, N., and J. Kurths, 2002: Nonlinear analysis of bivariate data with cross recurrence plots. *Phys. Lett. A*, **302**, 299–307.

Osborn, T. J., K. R. Briffa, S. F. B. Tett, P. D. Jones, and R. M. Trigo, 1999: Evaluation of the North Atlantic Oscillation as simulated by a climate model. *Climate Dyn.*, **15**, 685–702.

Palmer, T. N., 1999: A nonlinear dynamical perspective on climate prediction. *J. Climate*, **12**, 575–591.

Paninski, L., 2003: Estimation of entropy and mutual information. *Neural Comput.*, **15**, 1191–1254.

Paninski, L., 2004: Estimating entropy on m bins given fewer than m samples. *IEEE Trans. Info. Theory*, **50**, 2200–2203.

Papoulis, A., 1991: *Probability, Random Variables, and Stochastic Processes*. McGraw-Hill, 666 pp.

Pozo-Vásquez, D., M. J. Esteban-Parra, F. S. Rodrigo, and Y. Castro-Díez, 2001: A study of NAO variability and its possible non-linear influences on European surface temperature. *Climate Dyn.*, **17**, 701–715.

Robinson, W. A., S. Li, and S. Peng, 2003: Dynamical nonlinearity in the atmospheric response to Atlantic sea surface temperature anomalies. *Geophys. Res. Lett.*, **30**, 2038, doi:10.1029/2003GL018416.

Rockinger, M., and E. Jondeau, 2002: Entropy densities with an application to autoregressive conditional skewness and kurtosis. *J. Econ.*, **106**, 119–142.

Rogers, J. C., 1997: North Atlantic storm track variability and its association to the North Atlantic Oscillation and climate variability of northern Europe. *J. Climate*, **10**, 1635–1647.

Roulston, M. S., and L. A. Smith, 2002: Evaluating probabilistic forecasts using information theory. *Mon. Wea. Rev.*, **130**, 1653–1660.

Serreze, M. C., F. Carse, and R. G. Barry, 1997: Icelandic low cyclone activity: Climatological features, linkages with the NAO, and relationships with recent changes in the Northern Hemisphere circulation. *J. Climate*, **10**, 453–464.

Shannon, C. E., 1948: The mathematical theory of communication. *Bell Syst. Technol. J.*, **27**, 379–423.

Silverman, B. W., 1986: *Density Estimation for Statistics and Data Analysis, Monographs on Statistics and Applied Probability*. Chapman and Hall, 175 pp.

Sivia, D. S., 1996: *Data Analysis: A Bayesian Tutorial*. Oxford University Press, 200 pp.

Trigo, R. M., and J. P. Palutikof, 1999: Simulation of daily temperatures for climate change scenarios over Portugal: A neural network model approach. *Climate Res.*, **6**, 1161–1171.

Wilks, D. S., 1995: *Statistical Methods in the Atmospheric Sciences: An Introduction*. Academic Press, 467 pp.

Wu, A., and W. W. Hsieh, 2004: The nonlinear association between ENSO and the Euro-Atlantic winter sea level pressure. *Climate Dyn.*, **23**, 859–868.

## APPENDIX

### Edgeworth Expansion of a Bivariate Probability Density Function

This appendix gives the Edgeworth expansion of the PDF *ρ*_{U,W}(*u*, *w*) of two standardized uncorrelated variables *U*, *W* as expressed in (21), each of which is assumed to be an average of *n*_{eq} iid variables. Following Barndorff-Nielsen and Cox (1989), *υ*_{U,W}(*u*, *w*) is a function of the higher-than-second-order moments, given by (A1) with *l* = 3/2. The terms *H*_{(p)}(*u*) and *H*_{(q)}(*w*) are the Hermite polynomials, given by a recurrence relationship (Abramowitz and Stegun 1972).
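The recurrence is not written out here. A minimal sketch, assuming the probabilists' convention that is usual in Edgeworth expansions, namely *H*_{(0)} = 1, *H*_{(1)}(*u*) = *u*, and *H*_{(p+1)}(*u*) = *u* *H*_{(p)}(*u*) − *p* *H*_{(p−1)}(*u*), is the following; the function name is hypothetical.

```python
def hermite(p, u):
    """Probabilists' Hermite polynomial of order p evaluated at u.

    Uses the three-term recurrence H_{k+1}(u) = u*H_k(u) - k*H_{k-1}(u),
    starting from H_0 = 1 and H_1 = u.
    """
    if p < 0:
        raise ValueError("order must be non-negative")
    if p == 0:
        return 1.0
    h_prev, h = 1.0, float(u)
    for k in range(1, p):
        h_prev, h = h, u * h - k * h_prev
    return h


# Closed forms for comparison: H_2 = u^2 - 1, H_3 = u^3 - 3u,
# H_4 = u^4 - 6u^2 + 3.
u = 2.0
print([hermite(p, u) for p in range(5)])
```

Under this convention the polynomials are orthogonal with respect to the standard Gaussian density, which is why they appear as the natural basis for corrections to a Gaussian PDF.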

The coefficients of *H*_{(p)}(*u*) and *H*_{(q)}(*w*) in (A1) are expressed as products of the (*p* + *q*)-order cumulants *k*^{(p,q)} of the (*U*, *W*) distribution (Kenney and Keeping 1951). When dealing with uncorrelated variables *U* and *W* of zero mean and unit variance, as is the case in this paper, the cumulants used for the chosen truncation assume rather simple expressions.

Under Gaussian conditions, all the cumulants of order (*p* + *q*) equal to or greater than 3 vanish. Cumulants with *p* ≠ 0 and *q* ≠ 0 can easily be expressed in terms of nonlinear correlations between *U* and *W*.
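For zero-mean, unit-variance, uncorrelated variables, the general joint-cumulant formulas reduce to plain mixed moments, for example *k*^{(2,1)} = E(*U*^{2}*W*), *k*^{(3,1)} = E(*U*^{3}*W*), and *k*^{(2,2)} = E(*U*^{2}*W*^{2}) − 1. These are standard identities, given here as an assumption about the form of the simplified expressions rather than as a transcription of them. A minimal sketch of their sample estimates follows; the function names and synthetic data are hypothetical.

```python
import math
import random


def standardize(v):
    """Shift and scale a sample to zero mean and unit (population) variance."""
    n = len(v)
    m = sum(v) / n
    s = math.sqrt(sum((a - m) ** 2 for a in v) / n)
    return [(a - m) / s for a in v]


def decorrelate(u, v):
    """Standardized residual of v after removing its linear dependence on u.

    Assumes u is already standardized, so the regression slope is cov(u, v).
    """
    n = len(u)
    mv = sum(v) / n
    slope = sum(a * (b - mv) for a, b in zip(u, v)) / n
    return standardize([b - mv - slope * a for a, b in zip(u, v)])


def cross_cumulants(u, w):
    """Sample cross cumulants of standardized, uncorrelated u and w."""
    n = len(u)

    def moment(p, q):
        return sum(a ** p * b ** q for a, b in zip(u, w)) / n

    return {
        (2, 1): moment(2, 1),
        (1, 2): moment(1, 2),
        (3, 1): moment(3, 1),
        (1, 3): moment(1, 3),
        (2, 2): moment(2, 2) - 1.0,
    }


# Synthetic data (not from the paper).
rng = random.Random(3)
u = standardize([rng.gauss(0.0, 1.0) for _ in range(20000)])
# Independent Gaussian pair: every cross cumulant should be near zero.
k_gauss = cross_cumulants(u, decorrelate(u, [rng.gauss(0.0, 1.0) for _ in u]))
# Quadratic dependence: uncorrelated with u, but k^(2,1) is clearly nonzero.
k_quad = cross_cumulants(
    u, decorrelate(u, [a * a + 0.5 * rng.gauss(0.0, 1.0) for a in u]))
print({key: round(val, 2) for key, val in k_gauss.items()})
print({key: round(val, 2) for key, val in k_quad.items()})
```

The second case shows why these cumulants measure non-Gaussian dependence: the two variables are uncorrelated by construction, yet the (2,1) cumulant is of order one.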

Fig. 2. Composed graphics for the six selected points: (a) ATL, (b) BAL, (c) EUS, (d) GRE, (e) SCO, and (f) RUS. Each graphic contains 1) time series of the Gaussian precipitation (*Y*_{g}) against the Gaussian NAO index (*X*_{g}; filled circles for 1951–77, open circles for 1978–2003); 2) contours of the corresponding joint PDF; and 3) linear and nonlinear prediction of *Y*_{g}. (top of each panel) Smoothed graphics of the conditional RMSE of the linear (thin curve) and nonlinear prediction (thick curve) of *Y*_{g} from *X*_{g} are also shown.

Citation: Monthly Weather Review 135, 2; 10.1175/MWR3407.1

Fig. 3. (a) Map of the asymmetry *J*_{c} (CI = 0.02; SR at *α* = 90% shaded), (b) map of the cumulant *k*^{(2,1)} (CI = 0.1; SR at *α* = 90% shaded), (c) map of the cumulant *k*^{(3,1)} (CI = 0.1; SR at *α* = 90% shaded), and (d) map of *P*_{neg} (CI = 0.002). All quantities are computed for variables subject to Gaussian anamorphosis. The 90% significant area fractions FS-MCR are 0.17, 0.17, and 0.19 for (a), (b), and (c), respectively. Corresponding values of FS-MCG are 0.16, 0.13, and 0.15.

Fig. 4. (a) Map of the Gaussian mutual information *I*_{g} (CI = 0.1; SR at *α* = 90% shaded), (b) map of the non-Gaussian MI (maximum entropy estimator) (CI = 0.01; SF at 90% shaded), (c) the same as (b), but for the Edgeworth estimator, and (d) map of the information correlation *c*_{inf} (CI = 0.1; SR at 90% shaded). All quantities are computed for Gaussian variables. The 90% significant area fractions FS-MCR are 0.73, 0.23, 0.22, and 0.64 for (a), (b), (c), and (d), respectively. Corresponding values of FS-MCG are 0.73, 0.12, 0.15, and 0.57.

Fig. 5. (a) Non-Gaussian MI *I*_{ng(EF)} estimator as a function of the ascending-order sorted values of non-Gaussian MI *I*_{ng(EI)}, and (b) non-Gaussian MI *I*_{ng(EI)} estimator as a function of the ascending-order sorted values of non-Gaussian MI *I*_{ng(ME)}.

Fig. 6. (a) Map of the asymmetry test *J*_{c} for original data (CI = 0.02; SR at *α* = 90% shaded), (b) Gaussian MI difference between Gaussian and original data, (c) map of the non-Gaussian MI *I*_{ng(ME)} (CI = 0.01; SF at 90% shaded) for original data, and (d) map of *P*_{neg} for original data. The 90% significant area fractions FS-MCR are 0.17 and 0.24 for (a) and (c), respectively. Corresponding values of FS-MCG are 0.13 and 0.14.

Table 1. Rejection intervals of the null hypothesis at the *α* level of significance, using the MCG, MCR (Gaussianized data), and MCR (original data) tests from the top to the bottom of each cell, respectively (see text for details).

Table 2. Values of correlation *c*, test-side correlations (*t*_{M}, *t*_{−}, *t*_{+}), asymmetry test *J*_{c}, Gaussian MI *I*_{g} and non-Gaussian MI (*I*_{ng(ME)} and *I*_{ng(EF)} estimators), information correlation *c*_{inf}, integral *P*_{neg} (see definition in text), and MSE (*E*^{2}) of the linear and nonlinear prediction for Gaussian variables, i.e., subject to Gaussian anamorphosis (G), in the six selected points. The values of *I*_{g}, *I*_{ng(ME)}, and *J*_{c} are also added for the original untransformed data (O).