
Using Common Principal Components for Comparing GCM Simulations

Sailes Sengupta and James S. Boyle

Program for Climate Model Diagnosis and Intercomparison, Lawrence Livermore National Laboratory, Livermore, California

Abstract

The technique of common principal components (CPC) is applied to compare the results of a number of GCM simulations. The data used are the 120 monthly mean fields from 30 Atmospheric Model Intercomparison Project (AMIP) simulations and an ensemble of five AMIP integrations from a single GCM. The spatial grid and 120 time points allow the calculation of up to 31 covariance matrices for input into the CPC analyses.

The CPC methodology is applied to a variety of model comparison problems within the context of the AMIP experiment. The aspects of the simulations used for demonstration are the seasonal cycle of precipitation over the United States, the global 200-hPa velocity potential, the difference between the 200-hPa divergence of four closely related AMIP models and the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalysis, and a small ensemble of five simulations of the European Centre for Medium-Range Weather Forecasts (ECMWF) AMIP model.

These analyses demonstrate the utility of the CPC approach in identifying the models’ systematic errors, in reducing the data in ensembles of simulations, and in model parameterization comparisons. The common errors among the models tend to highlight the areas in which a gap in knowledge or parameterization implementation exists. In addition, CPC analyses provide a more complete statistical picture of an ensemble of simulations within a single model than the traditional means and variances. It is often the common aspects of the ensembles that are sought as a robust signal.

The CPC analyses tend to support the observation that the models often have more in common with each other than with the observations. The CPC has the ability to answer many pertinent questions posed in the arena of model comparison when used in conjunction with other techniques.

Corresponding author address: Dr. James S. Boyle, Prog. for Clim. Model Diagnosis and Intercomparison, Lawrence Livermore National Laboratory, Mail Stop L-264, P.O. Box 808, Livermore, CA 94550.

Email: boyle@cobra.llnl.gov


1. Introduction

In this work common principal components is presented as a statistical tool to address the task of model comparison and verification. The need for such a tool became apparent during work on data from the Atmospheric Model Intercomparison Project (AMIP) of the World Climate Research Programme. The AMIP provides an infrastructure for the comparison of atmospheric general circulation models (AGCMs) and their response to specified sea surface temperature (SST) variations. The participants in AMIP simulate the global atmosphere for the decade 1979–88 using a common solar constant and CO2 concentration and a common sequence of monthly averaged SST and sea ice datasets. An overview of AMIP is provided by Gates (1992).

The AMIP was intended to document the state of AGCM modeling and to facilitate diagnosis of the causes of any differences that showed themselves. The models do display a number of large differences, but it is not a trivial task to determine the causes of these differences. For example, the tendency of the models to have poles that are too cold and a tropical region that is too warm has been documented for some time (Boer et al. 1992), but if there is any fundamental flaw that explains this error, it has yet to be unambiguously identified.

There have been numerous useful statistical techniques put forth for the purposes of model verification and analysis. Many of these have as their major thrust a pairwise comparison. The model output is compared to the observations (e.g., 500-hPa geopotential) or the relationship between two variables (e.g., SST and precipitation) is examined between the models and the observations. This approach is well summarized in the comprehensive work of Bretherton et al. (1992). The AMIP analysis presents a slightly different variation on this theme. The large number of models to compare (∼30) places a practical restriction on the type of pairwise analyses that can be carried out. There is also value in ascertaining the systematic errors that cut across all the models. Presumably, such errors would indicate a fundamental gap in understanding or a defect in parameterization implementation that might be corrected if identified.

The desire for a concise analysis of systematic errors gives rise to two potentially conflicting requirements. First, there is a need for a parsimonious representation of the model and observed spatial and temporal evolution to facilitate comparison given the large mass of data. Second, it is important to be able to identify the physical processes that are related to the model errors and so the analysis needs to preserve sufficient spatial and temporal detail so that this can be accomplished. For example, global mean temperature is an efficient reduction of the temperature data, but these data alone will most likely not reveal the processes responsible for deviations from the observations.

An additional complicating factor in model verification and comparison is the realization that the models are chaotic in the sense that some aspects of a simulation can change substantially when started from a slightly different set of initial conditions. It is important to be able to isolate the common aspects of an ensemble of model simulations. These aspects are presumably the robust features that are representative of the systematic errors and basic behavior of the model.

The purpose of this paper is to put forward the technique of common principal components (CPC; Flury 1988) as a framework for model comparison and as a useful way to incorporate ensemble information. The technique has recently been applied to the comparison of ocean models by Frankignoul et al. (1995). It will be seen to be a useful complement to the other analysis tools documented in the literature.

In the next section the basic concepts associated with CPCs will be outlined and some comparisons drawn to previously published techniques. In section 3 the datasets used are described and a number of applications of the CPC technique are presented. These point out the usefulness and interpretation of the CPCs and how they might complement other methods. This section also makes some points about the character of the AMIP integrations. Section 4 discusses some conclusions and extensions of the techniques and summarizes a strategy for model comparison, including ensemble integrations.

2. Common principal components

Common principal components are most easily described in comparison to the closely related principal component analysis. Principal components (PC), also referred to as empirical orthogonal functions (EOF), have a long history in atmospheric analysis since being introduced by Lorenz (1956). Principal components are invaluable tools in that they provide an efficient method of compressing the data in both space and time and present the results in terms of independent modes of variability. The principal vectors are the eigenvectors of the data covariance matrix whose elements are formed from the differences from some specified means. Each successive eigenvector is orthogonal to the set of previous ones and explains the maximum amount of possible remaining variance in the data. The eigenvectors are usually arranged in order according to the percentage of variance explained. The PCs are derived directly from the data themselves as opposed to some a priori set of functions such as in Fourier analysis. Often, but not necessarily, the leading vectors can be associated with some aspects of physical processes.
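
As a concrete illustration, the principal vectors and components can be computed directly from a matrix of anomalies. The sketch below is in Python with NumPy; it is illustrative only (the computations in this study used IMSL routines), and the array names are hypothetical:

```python
import numpy as np

def principal_components(X):
    """X: anomaly matrix, shape (n_time, n_space), e.g., 120 monthly
    deviations from the 120-month mean at 95 grid points."""
    S = np.cov(X, rowvar=False)              # (n_space, n_space) covariance
    evals, evecs = np.linalg.eigh(S)         # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]          # reorder by variance explained
    evals, evecs = evals[order], evecs[:, order]
    pcs = X @ evecs                          # principal component time series
    pct_var = 100.0 * evals / evals.sum()    # percent variance per mode
    return evecs, pcs, pct_var
```

The columns of evecs are the principal vectors, the columns of pcs are the associated time series, and pct_var gives the percent variance explained by each mode.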

CPC is a generalization of the PC technique to the case of several groups. Rather than a single covariance matrix, there are now two or more. The basic assumption is that the PC transformation is identical in all the populations considered, while the variances associated with the components may vary between groups. Thus, similar to PC analysis, the output of the CPC analysis is a set of spatial patterns (vectors), but unlike PCs there is more than one time series associated with each pattern. The ordering of the eigenvectors varies by group, and it is by no means necessary that all the groups share the same ordering of the vectors. An advantage in using the CPC model is that one can compare corresponding principal components. A formal test of significance for the hypotheses of (partial) commonality of the principal axes of representation of two (or several) fields of data along the lines given in Flury (1988) is, however, not possible to implement directly. These tests of significance require that the sample fields (over discrete time instants) be independent. This is generally an incorrect assumption for almost all meteorological fields. This problem itself does not preclude the use of common principal components as a diagnostic tool for understanding the commonality of the fields. The temporal correlation could also be addressed by a sparser temporal sampling. This is impractical for the short span of time represented by the AMIP simulations but could be implemented for much longer integrations.
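
Formally, following Flury (1988), the CPC hypothesis for k groups is that a single orthogonal matrix simultaneously diagonalizes all of the population covariance matrices:

\[
H_{\mathrm{CPC}}:\quad \mathbf{B}^{\mathsf{T}}\,\boldsymbol{\Sigma}_i\,\mathbf{B} = \boldsymbol{\Lambda}_i, \qquad i = 1, \ldots, k,
\]

where the columns of B are the common principal vectors and each \(\boldsymbol{\Lambda}_i\) is diagonal, containing the variances specific to group i. The groups share the vectors but not the variances, which is why the component ordering can differ from group to group.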

In terms of pairwise comparisons there exist powerful techniques, summarized in Preisendorfer (1988) and Bretherton et al. (1992). If there is a fiducial field to compare against, say the observed fields, then a commonly used method is to determine how the principal modes of the other groups project onto those of the base field. The CPC analysis does not replace this technique, but rather complements it. The most effective means to illustrate how the CPC analysis can complement other methods is to provide some examples. As with any diagnostic technique there are situations where its use is inappropriate and other times when it can provide useful information. The next two sections will present examples on datasets where the CPC technique has proved useful.

The algorithm used for determining the common eigenstructure was that of Flury and Gautschi (1986). The code was tested against the International Mathematical and Statistical Library (IMSL, 1991) routine KPRIN, and the results were identical. The IMSL routine was not used since access to some intermediate results was desired and the IMSL routines did not provide it. The covariance matrices were computed using the IMSL routine CORV. The PC analysis was carried out using the IMSL routine PRIN.
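
To convey the flavor of the computation, the following sketch performs a Jacobi-sweep simultaneous near-diagonalization of several covariance matrices in Python with NumPy. It uses the closed-form least-squares rotation angle (in the spirit of Cardoso and Souloumiac’s joint-diagonalization work) rather than the exact maximum-likelihood Flury–Gautschi step, so it should be read as an illustrative stand-in for the routine used here, not a reproduction of it:

```python
import numpy as np

def common_principal_vectors(covs, sweeps=100, tol=1e-12):
    """Rotate a list of symmetric covariance matrices toward a common
    nearly diagonal form; returns the shared orthogonal basis B and the
    group-specific variances (the diagonals in that basis)."""
    covs = [c.copy() for c in covs]
    p = covs[0].shape[0]
    B = np.eye(p)
    for _ in range(sweeps):
        rotated = False
        for i in range(p - 1):
            for j in range(i + 1, p):
                # Each group's (i, j) 2x2 block is summarized by a 2-vector.
                g = np.array([[c[i, i] - c[j, j], 2.0 * c[i, j]] for c in covs])
                G = g.T @ g
                _, vecs = np.linalg.eigh(G)
                x, y = vecs[:, -1]                  # dominant eigenvector of G
                if x < 0.0:
                    x, y = -x, -y                   # keep the inner rotation
                r = np.hypot(x, y)
                if r < tol:
                    continue
                c_th = np.sqrt((x + r) / (2.0 * r))    # cos(theta)
                s_th = y / np.sqrt(2.0 * r * (x + r))  # sin(theta)
                if abs(s_th) < tol:
                    continue
                rotated = True
                R = np.eye(p)
                R[i, i] = R[j, j] = c_th
                R[i, j], R[j, i] = -s_th, s_th
                covs = [R.T @ c @ R for c in covs]  # rotate every group
                B = B @ R
        if not rotated:
            break
    return B, [np.diag(c).copy() for c in covs]
```

Sorting the returned variances separately for each dataset reproduces the group-dependent ordering of components discussed above.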

3. Applications of CPCs to data

a. U.S. precipitation

The data used in this part of the study consist of precipitation observations gridded to a 4° lat × 5° long grid. The observations are from surface stations over land from Schemm et al. (1992) and satellite microwave sounding unit (MSU) estimates from Spencer (1993) over the oceans. The bulk of the analysis grid used here is over the United States, where the observational network provides reliable precipitation fields. The 120-month mean was subtracted from each grid point to form the deviations. The seasonal cycle was retained since it was of interest to compare how well the GCMs simulated this cycle. The data form a matrix of 120 time points by 95 space points. Figure 1 shows the spatial coverage of the data. The model data were interpolated to the observational grid using an area weighting scheme, which preserved the spatial mean. The data were all monthly means for the 120 months from January 1979 to December 1988. This means that some 31 covariance matrices served as the input to the CPC routine.
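
In outline, the preprocessing for this analysis reduces to forming anomalies and one covariance matrix per dataset. A minimal sketch, assuming each dataset has already been interpolated to the common grid and stacked as a (120, 95) array (the variable names are hypothetical):

```python
import numpy as np

def covariance_from_fields(fields):
    """fields: (120, 95) array of monthly means for one dataset."""
    anomalies = fields - fields.mean(axis=0)   # deviations from 120-month mean
    return np.cov(anomalies, rowvar=False)     # (95, 95) covariance matrix

# One covariance matrix per dataset (30 AMIP models + observations):
# covs = [covariance_from_fields(d) for d in datasets]
```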

The leading two principal vectors (PVs) of the observational precipitation dataset are shown in Fig. 1. The time series of the leading three principal components are in Fig. 2, while the percent variance explained by the leading four PCs is shown in Table 1. The leading PC can be interpreted in terms of the seasonal cycle of precipitation described by Hsu and Wallace (1976) and Horn and Bryson (1960). There is a winter maximum of rainfall on the west coast and off the Gulf and east coasts, and a summer maximum in the central United States. It is useful to point out the high level of interannual variability displayed by the principal components in Fig. 2. The PVs were also computed from the 30 individual precipitation fields of the AMIP models. Just considering the first two vectors, this meant comparing 60 figures. From a first examination of the PVs, the models appeared to be poor in their simulation, but it was difficult to make any general categorization. A CPC analysis was then performed on the 31 datasets (30 models + observations). The percent variance explained by the leading three CPCs, ordered according to the observed dataset, is shown in Fig. 3.
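
The per-dataset percentages plotted in Fig. 3 follow from projecting each dataset’s covariance matrix onto the common vectors; a sketch, reusing the names from the snippets above:

```python
import numpy as np

def percent_variance(B, cov):
    """Percent variance of one dataset explained by each common vector."""
    var = np.diag(B.T @ cov @ B)         # variance along each common vector
    return 100.0 * var / np.trace(cov)   # trace = total variance of the dataset
```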

Figure 3 can be used to illustrate a few points about this analysis. The first is that the percent variance explained can be ordered differently for each dataset. In Fig. 3 we have arbitrarily chosen to use the ordering of the observed dataset, but for some models these are clearly not the leading sequence. The second point is that the common mode dominant among the models is not the first mode for the observations, but the second mode. This information was quite useful in making sense of the PV charts of the individual models. Going back to the individual plots of the PVs and comparing the second PV of the observations to the leading PV of the models indicated a systematic problem with many of the models in depicting the precipitation over the eastern and central United States. This correspondence was not at all obvious when trying to look through all the charts due to the many slight variations displayed. The dominant seasonal cycle of the models resembles Fig. 1b, rather than Fig. 1a. The models tend to place the precipitation regime characteristic of the central United States too far eastward. Figure 3 lends credence to the common wisdom that the models tend to look more like each other than the observations.

In Fig. 3 it can also be noted that the models generally have a larger percent variance for a single CPC mode than do the observations. Models that do not show this in the three vectors displayed inevitably have such a component elsewhere. This is also true for the individual PC computations. Note that the percent variance of the observational dataset has dropped substantially between Table 1 and Fig. 3. In Fig. 3, the fit is to a common set of vectors, and the models have such a large variation compared to the observations that the common fit should not be expected to be as good as in the PC case. The corresponding CPC vectors for the observed set (not shown) do, however, bear a close qualitative resemblance to Fig. 1. Table 2 gives the percent variance explained by the leading principal components of the UCLA AMIP model, whose data are fairly typical of the model behavior. The steeper spectrum is an indication that the models occupy a simpler world than that depicted by the observations in Fig. 2 and Table 1. Figure 4 shows the time series of the leading three PCs of the UCLA model; every year is much like the next.

Beyond common characteristics of the models, Fig. 3 illustrates the substantial differences between the models. The models with low values in Fig. 3 do not have a large variance explained by any of the three leading CPCs of the observations. These outliers have less in common with the observations or with other models whose peaks in variance explained occur at other components. The models and observations do share a common variation in the West Coast precipitation. This is likely a consequence of the fact that the models all use the same set of observed SSTs, which play an important, and evidently dominant, role in determining the variation of the rainfall in the western United States.

This example illustrates that the CPC analyses can illuminate some key differences and commonality of the models and observations. These facts can then be taken into account as other tools are brought to bear on the problem. It can guide the choice of pairwise comparisons or indicate models that are so far off that they might not be profitable to examine further.

b. 200-hPa velocity potential

The input data for these calculations were the velocity potential computed from the 200-hPa winds. The observational data were from the National Centers for Environmental Prediction–National Center for Atmospheric Research (NCEP–NCAR) reanalyses. The model data were from two sources: the first was the AMIP integrations, and the second was a small ensemble of AMIP integrations using the European Centre for Medium-Range Weather Forecasts (ECMWF) AMIP model. This ensemble had five members, each integration differing only in the initial conditions used. The initial conditions for the first run were the observed data for 1 January 1979, while the initial conditions for the subsequent runs were taken from the ending state of the previous run. The data were all monthly means for the 120 months from January 1979 to December 1988.

All the data were transformed to orthogonal spherical harmonics, and the spherical harmonic series was truncated at T10. This limits the results to large-scale features but allows a fit in the spatial domain, since there are 110 spatial coordinates (110 coefficients of the spherical harmonic decomposition) and 120 time points. It was also desired to minimize the effects of the varying truncations between the models; the T10 truncation is well within the horizontal resolution of all the AMIP models. From the basic monthly data two sets of deviations were computed for the covariance matrices to be used for intercomparison. The first described the interannual variations, in which the seasonal cycle was removed by subtracting from each month the 10-yr mean of that month. The second set retained the seasonal cycle but computed the differences between the models and the NCEP–NCAR reanalysis fields for the 120 months of the AMIP decade. Insofar as the reanalysis depicts reality, these data could be considered error fields. From these sets of spherical harmonic data the covariance matrices were formed for input into the PC and CPC algorithms.
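
In code the two deviation sets differ only in the reference subtracted. A sketch, with model and reanalysis as hypothetical (120, 110) arrays of spherical harmonic coefficients (120 months by 110 coefficients):

```python
import numpy as np

def interannual_deviations(coeffs):
    """Remove the seasonal cycle: subtract each month's 10-yr mean."""
    climatology = coeffs.reshape(10, 12, -1).mean(axis=0)   # (12, n_coeff)
    return coeffs - np.tile(climatology, (10, 1))

def error_deviations(model, reanalysis):
    """Retain the seasonal cycle: difference the model from the
    reanalysis, then remove the 120-month mean of that difference."""
    diff = model - reanalysis
    return diff - diff.mean(axis=0)
```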

c. Ensemble comparison

Outside the arena of multiple-model comparison, the analysis of ensembles of simulations is potentially the most useful application of CPCs to GCM output. Figure 5 presents the time series of the leading CPC of the 200-hPa velocity potential for the five simulations of the ECMWF model, ordered with respect to the AMIP simulation. These data represent the interannual variations, the monthly means having been removed. The ensemble members all share the same first four CPCs, although they vary slightly at higher components. The leading four explain more than 80% of the variance, and Table 3 gives the percent variance explained by the leading three. Two things are evident from Fig. 5. First, the simulations all follow a similar time evolution, which clearly reflects the pattern of the ENSO activity for the decade. This is clear from comparing Fig. 5 to Fig. 6, which is a plot of the Southern Oscillation index (SOI) from the Climate Prediction Center. This index is the difference in sea level pressure between Darwin, Australia, and Tahiti. It is tightly linked to the cycle of the equatorial Pacific SST and the atmospheric response to the SST. The two distinct dips in 1982–83 and 1986–87 represent two strong ENSO events, the 1982–83 event being the strongest on record. Figure 5b makes use of the same data as Fig. 5a except that it shows the differences of each simulation from the mean of all the simulations at each time point. These difference curves are in a sense a measure of the nondeterministic component of the flow. The mean time series has an ENSO signal that clearly rises above the noise level during the larger excursions of the SOI. During the interim periods one cannot distinguish the influence of the SST variations from the model’s intrinsic noise.
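
The decomposition displayed in Fig. 5b is simply the ensemble mean of the five CPC time series and each member’s departure from that mean; a sketch, with pcs a hypothetical (5, 120) array holding the leading CPC time series of the five members:

```python
import numpy as np

def ensemble_split(pcs):
    """pcs: (n_members, n_time) time series for one common component.
    Returns the ensemble mean (the SST-forced signal) and each member's
    departure from it (the intrinsic, nondeterministic part)."""
    mean = pcs.mean(axis=0)
    return mean, pcs - mean
```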

Figure 7 is a geographical plot of the divergence pattern of the leading common principal vector. Most of the amplitude of the signal is over the tropical Pacific, and the pattern is broadly consistent with that expected from observed precipitation anomalies associated with ENSO events. There is enhanced upper-level divergence in the eastern equatorial Pacific. The CPC data compression retains sufficient information to permit some physical interpretation of the modes identified.

The point to be made here is not the discovery of new relationships but a measure of the impact of the SSTs on simulations with varying initial conditions. The ECMWF model appears to have a robust, reasonable simulation of the ENSO response in the global 200-hPa divergence, but between these strong ENSO periods the effects of the SSTs do not force the ensemble simulations into lockstep.

Ensemble integrations are now commonplace among the major weather forecasting centers (Toth and Kalnay 1993). The CPC methodology provides a framework for combining an ensemble into a single field. This combining is necessary since the number of members of an ensemble is often greater than 20; this many simulations provide more information than can be easily assimilated by a forecaster. The CPC technique provides a consensus summary, which is usually the type of information needed. There are some indications that the SSTs are predictable a month or a season in advance, and if the atmospheric models are then driven by these SSTs, a climate prediction can be made. A CPC analysis of an ensemble of such atmospheric predictions would be an efficient way of producing a robust climate forecast.

Figure 8 is the same as Fig. 5b, except for CPC 2. In these curves there is also an influence of the ENSO variations. The mean curve rises less above the “noise” level than in Fig. 5b. By the third CPC (not shown) the mean curve is almost completely engulfed by the ensemble variations, which restricts the number of conclusions that can be drawn using a single run, or even five runs, from this perspective. Figure 5 indicates that beyond the ENSO maxima the ensemble members do not have a great deal in common for this mode. Each has a different response given a common SST forcing for the decade. This is not unexpected, since on the global scale a great many more variables influence the interannual variability of the upper-level divergence field besides the equatorial Pacific SSTs.

In the foregoing we have presented some aspects of the relationships of the members of the ensemble to each other. A logical next step is to ascertain what relation the five ensemble members have to the observed data. An easy path is simply to include the observations as another dataset along with the five ensemble members in performing the CPC analysis. The percent variance explained by the leading three CPCs for this analysis, which is now over six datasets (five ensemble members + NCEP reanalysis), is shown in Fig. 9. Figure 10 shows the time series of the leading two CPCs for this analysis, with the NCEP reanalysis shown as the thick solid line. Figure 10a shows that the leading mode is associated with the ENSO variations. Comparing Figs. 10a and 10b indicates that the model does a fair job of tracking the ENSO variations for the period; however, beyond the first CPC the correspondence almost vanishes. In Fig. 10a the ensemble members agree with each other and with the observations, while in Fig. 10b the ensemble members agree with each other but are at odds with the observations. The percent variance explained (Fig. 9) indicates that the leading vector is more dominant in the observations. A single run would be adequate to capture this aspect of the ECMWF AMIP integration.

d. Model development and evolution

1) Divergence error fields

In the work presented next, the difference fields computed by subtracting the 200-hPa divergence of the NCEP–NCAR reanalysis from the AMIP models are compared. In this case the common behavior would reflect some common error in the models. In these data the seasonal cycle was retained, and the covariance matrices were formed from the deviations from the 120-month mean.

The most brute-force approach is to take all 30 models for which we had data and calculate the CPCs for the difference fields between the models and the reanalysis. The results of this analysis are not shown since they did not reveal much. The leading three common vectors, which were shared by all the models, closely resembled the PC analysis of the NCEP–NCAR reanalysis. This could be interpreted as meaning that the models are all in error and that what they share most in common is a difference from the reanalysis with no particular common pattern. Going into the analysis, it was hoped that the common error patterns might indicate specific regions or times when the models had particular problems in simulating the upper-level divergence. It might be thought that there would emerge specific locations where the convective parameterizations would evince a common breakdown, especially in the Tropics. However, for these data there is not a common localized systematic error as was evident in the U.S. precipitation data. The models’ common feature is that they are different from this observational dataset, but evidently they are different in a host of ways.

The CPC analysis was then applied to a selected subset of four models, which a priori were expected to have some common type of error patterns. The models chosen for this analysis were the ECMWF model, the U.K. Universities’ Global Atmospheric Modelling Programme (UGAMP) model, the Max Planck Institute for Meteorology (MPI) ECHAM-3 AMIP model, and the MPI ECHAM-4 model run using the AMIP boundary conditions. These models were chosen since they all share the same basic formulation of the ECMWF model; indeed, they all started from the same code. The UGAMP model differs from the ECMWF only in the penetrative convection scheme used: UGAMP uses the Betts–Miller convective adjustment, whereas the ECMWF uses the Tiedtke mass flux scheme, as do the MPI models. Although sharing the same convective parameterization, the MPI and ECMWF models differ in many ways, as documented by Phillips (1994). The chief differences are the treatment of the land surface processes and the radiation, particularly the interaction with clouds. Figure 11 presents the percent variance explained for the leading three vectors. It is evident that the ECMWF and UGAMP models and the two MPI models form two pairs. It might be felt that for this field, the upper-level divergence, the convective parameterization would play an overwhelming role in determining the model characteristics. However, in this case the models sharing the same convective scheme differ, while the two models that are alike except for this parameterization are similar. Figure 12 shows the time series of the two leading CPCs for this group of models.

The MPI changes were intended to improve the performance of the ECMWF forecast model in climate simulations. The MPI models show a clear reduction in amplitude of the leading CPC, which is dominated by the seasonal cycle. The two MPI simulations appear to differ mostly in amplitude, except during the ENSO events of 1982–83 and 1986–87, when they also fall out of phase. Table 4 provides the mean absolute values of the time series, which in this case are actually errors with respect to the NCEP–NCAR reanalysis. There is actually a slight increase in the difference for the second vector for the MPI models with respect to the other two. Figure 13 shows the two leading common principal vectors. There is considerable amplitude in the Tropics, which might be expected, but there is also a significant contribution in the midlatitudes. The maximum centered just west of Central America is a characteristic error for this suite of models. The variations of the monsoon over eastern Asia are also evident in the figure.

The ensemble of five ECMWF integrations was analyzed for the velocity potential difference field in order to judge whether the variations between the single runs of the four models lie outside what might be expected from the intrinsic variability. It would be better to run such an ensemble analysis on each model, but the ECMWF model is most probably a fair proxy since they all share the ECMWF dynamical core and many parameterizations. The results are shown in Figs. 14 and 15. What is interesting is that the UGAMP and ECMWF models are actually indistinguishable given the variability represented by the ensemble. It would appear that a larger number of integrations would be needed to establish whether the UGAMP and ECMWF models are truly distinct from this particular perspective.

4. Discussion and conclusions

In the foregoing discussion we have used the CPC methodology on the AMIP model data, an ensemble of ECMWF model simulations, and observed data to show the following.

1) The AMIP models evince a common tendency to displace the precipitation regime characteristic of the central United States to the East Coast.

2) The large-scale, 200-hPa divergence of an ensemble of five ECMWF AMIP simulations shows a common response to the ENSO events of the 1979–88 decade. However, between these prominent events the intrinsic variability of the model obscures any common SST forcing.

3) An analysis of four AMIP simulations with models having almost identical dynamical cores indicates that the effects of varying the convective parameterization on the upper-level divergence may be less important than those of other modifications.

4) The CPC technique gives a quantitative basis for the common wisdom that the models tend to look more like each other than the available observations.

CPCs have been shown to be useful in the comparison of the output of a large number of GCMs. This type of analysis attempts to characterize the systematic model errors, as there is value in ascertaining the systematic errors that cut across all the models. Presumably, such errors would indicate a fundamental gap in understanding or a defect in parameterization implementation that might be corrected if identified. CPCs have also demonstrated utility as a straightforward way to summarize the robust results of an ensemble of simulations from a single model. In the future, this technique will be applied to time series of longer simulations. One idea is to compare the common components between decades of a multidecadal coupled climate system model.

It is important to indicate what the CPCs are not. The technique is not a replacement for describing the coupled patterns of, for example, SST and 500-hPa geopotential. The analysis of coupled patterns is well described by Bretherton et al. (1992) using a number of sophisticated techniques. Nor is the CPC a replacement for the simple PC. The CPCs are a compromise among the covariance matrices provided to the algorithm and do not preserve the powerful, concise description inherent in PCs. The CPC does not permit the rotation of individual members of the group being analyzed. In this work the CPC analyses have been used to complement the PC information.

For the purpose of comparing two fields, the technique of projecting the model field onto the observational PCs, as described by Preisendorfer (1988), provides information that the CPC cannot. However, there are fields for which the observational quality is dubious. In any case the CPC provides a kind of consensus viewpoint of corresponding components that has been shown to be useful. The CPC technique is well worth adding to the tools of data analysis. The concise description of ensemble data probably holds the greatest promise for general modeling beyond the AMIP-type intercomparisons. In model intercomparison studies the CPC analysis certainly contributes an important perspective not easily achieved by other techniques.

Acknowledgments

The cooperation of the ECMWF in making their forecast model available and in providing expert technical advice for this research is gratefully acknowledged. This work was performed under the auspices of the Department of Energy Environmental Sciences Division by the Lawrence Livermore National Laboratory under Contract W-7405-ENG-48.

REFERENCES

  • Boer, G. J., and Coauthors, 1992: An intercomparison of the climates simulated by 14 atmospheric general circulation models. J. Geophys. Res.,97, 12 771–12 786.

  • Bretherton, C. S., C. Smith, and J. Wallace, 1992: An intercomparison of methods for finding coupled patterns in climate data. J. Climate,5, 541–560.

  • Flury, B., 1988: Common Principal Components and Related Multivariate Models. J. Wiley, 258 pp.

  • ——, and W. Gautschi, 1986: An algorithm for simultaneous orthogonal transformation of several positive definite symmetric matrices to nearly diagonal form. SIAM J. Sci. Statist. Comput.,7, 169–184.

  • Frankignoul, C., S. Fevrier, N. Sennechael, J. Verbeek, and P. Braconnot, 1995: An intercomparison between four tropical ocean models: Thermocline variability. Tellus,47A, 351–364.

  • Gates, W. L., 1992: AMIP: The Atmospheric Model Intercomparison Project. Bull. Amer. Meteor. Soc.,73, 1962–1970.

  • Horn, L. H., and R. A. Bryson, 1960: Harmonic analysis of the annual march of precipitation over the United States. Ann. Assoc. Amer. Geogr.,50, 157–171.

  • Hsu, C.-P., and J. M. Wallace, 1976: The global distribution of the annual and semiannual cycles in precipitation. Mon. Wea. Rev.,104, 1093–1101.

  • IMSL, 1991: IMSL Stat/Library. IMSL Inc., 1578 pp.

  • Lorenz, E. N., 1956: Empirical orthogonal functions and statistical weather prediction. Science Rep. 1, Dept. of Meteorology, MIT, 49 pp. [NTIS AD 110268.].

  • Phillips, T. J., 1994: A summary documentation of the AMIP models. PCMDI Rep. 18, Program for Climate Model Diagnosis and Intercomparison, University of California, Lawrence Livermore National Laboratory, 343 pp. [Available from Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, CA 94551-9900.].

  • Preisendorfer, R. W., 1988: Principal Component Analysis in Meteorology and Oceanography. Vol. 17, Developments in Atmospheric Science, Elsevier, 425 pp.

  • Schemm, J. S., S. Schubert, J. Terry, and S. Bloom, 1992: Estimates of monthly mean soil moisture for 1979–1989. NASA Tech. Memo. 104571, 254 pp. [Available from NASA/Goddard Space Flight Center, Greenbelt, MD 20771.].

  • Spencer, R. W., 1993: Global oceanic precipitation from the MSU during 1979–91 and comparisons to other climatologies. J. Climate,6, 1301–1326.

  • Toth, Z., and E. Kalnay, 1993: Ensemble forecasting at NMC: The generation of perturbations. Bull. Amer. Meteor. Soc.,74, 2317–2330.

Fig. 1.

(a) The leading principal vector of the observed precipitation dataset deviations from the 120-month mean. This mode explains 25% of the variance. Contour interval is 0.01. Dashed lines indicate negative values. Solid lines indicate zero and positive values. (b) As in (a) except for the second principal vector. This explains 14% of the variance.

Fig. 2.

The leading three principal components of the observed precipitation dataset deviations from the 120-month mean.

Fig. 3.

Percent variance explained for a common PC analysis for the observed precipitation and AMIP model precipitation dataset deviations from the 120-month mean. The components are ordered in the sequence of the observed data.

Fig. 4.

The leading three principal components of the UCLA AMIP precipitation dataset deviations from the 120-month mean.

Fig. 5.

(a) The leading common PC for each of the five members of the ECMWF AMIP ensemble of simulations for the 200-hPa interannual variations of the velocity potential. (b) The same data as in (a) except the mean (thick solid line) of the five members is shown along with the deviations of each simulation from this mean.

Fig. 6.

The Southern Oscillation index (SOI) from the Climate Prediction Center for 1979–88. This index is the difference in sea level pressure measured at Darwin, Australia, and Tahiti. The filtered curve is produced using an eight-point Gaussian smoothing.

Fig. 7.

The leading common principal vector of the five ECMWF AMIP ensembles for the 200-hPa interannual variations of velocity potential. Contour interval is 0.01. The dashed lines indicate negative values. The solid lines are positive and zero.

Fig. 8.

(a) As in Fig. 5b except for the second component. (b) As in Fig. 5b except for the third component. Note the scale change on the ordinate.

Fig. 9.

Percent variance explained for the leading three common principal components of the observed (NCEP reanalysis) and five ECMWF AMIP simulations for the 200-hPa velocity potential.

Fig. 10.

(a) The leading common principal component of the analysis with the NCEP reanalysis and the five members of the ECMWF AMIP ensemble of simulations for the 200-hPa interannual variations of the velocity potential. The thick solid line is the reanalysis data. (b) As in (a) except for the second component.

Fig. 11.

Percent variance explained for a common principal component analysis for the difference in the global 200-hPa velocity potential of four AMIP models and the NCEP–NCAR reanalyses. The components are ordered in the sequence of the ECMWF model.

Fig. 12.

(a) The leading common principal component for each of the four models for the difference in the global 200-hPa velocity potential of each model from the NCEP–NCAR reanalyses. (b) As in (a) except for the second CPC component.

Fig. 13.

(a) The leading common principal vector of the four models for the difference in the global 200-hPa velocity potential of each model from the NCEP–NCAR reanalyses. Contour interval is 0.01. The dashed lines indicate negative values. The solid lines are positive and zero. (b) As in (a) except for the second CPC vector.

Fig. 14.

Percent variance explained for a common principal component analysis for the difference in the global 200-hPa velocity potential of the five ECMWF ensemble members and the NCEP–NCAR reanalyses.

Fig. 15.

(a) The leading common principal component for each of the five ECMWF ensemble members for the difference in the global 200-hPa velocity potential of each model from the NCEP–NCAR reanalyses. (b) As in (a) except for the second CPC component.

Table 1.

Percent variance explained for principal components of observed precipitation.

Table 2.

Percent variance explained for principal components of precipitation simulated by the UCLA model.

Table 3.

Percent variance explained for common principal components of 200-hPa velocity potential of the ECMWF AMIP model.

Table 4.

Mean absolute error of the CPCs for the four models shown. The error is with respect to the reanalysis data.
