1. Introduction
The potential economic value of monthly, seasonal, and longer-range forecasts having even modest skill has motivated attempts to produce and improve such forecasts for more than half a century (Namias 1968; Livezey 1990). During this period the methods employed to produce these forecasts have evolved considerably. Earlier efforts relied on a combination of objective and subjective, experience-based guidance founded on empirical measures of persistence and available data on surface anomalies (Namias 1964; Wagner 1989). However, as computing power increased and observation-based datasets improved, fully objective methods became the norm.
Two types of objective methods currently in use are statistical forecasts, which employ empirically derived relations based on past behavior, and dynamical forecasts, in which the equations describing the climate system are integrated in time from initial conditions constrained by observations. Until relatively recently dynamical atmospheric forecasts typically employed a “two tier” methodology whereby offline predictions of future anomalies of sea surface temperature (SST) serve as boundary conditions for an atmospheric general circulation model (AGCM). Increasingly, however, long-range forecasts use a “one tier” approach based on a coupled global climate model (CGCM). In this case, the interactive ocean component of the model provides future SSTs as an integral part of the forecast. Because of the fundamentally probabilistic nature of long-range predictions, dynamical forecasts usually are based on ensembles of predictions, from which probabilistic information may be derived. It has furthermore been shown in numerous studies (e.g., Kharin et al. 2009) that multimodel ensembles tend to offer better performance than single-model ensembles for a given ensemble size.
The evolution of operational seasonal forecasting in Canada has reflected these developments. Beginning in the mid-1990s, Environment Canada's Canadian Meteorological Centre (CMC) produced one-season forecasts using a two-tier, multimodel dynamical forecasting system in which future SST anomalies were specified by persisting their mean values from the month preceding the forecast. Initially this system employed six forecasts from each of two AGCMs, with bias correction and skill assessments obtained from a set of retrospective forecasts comprising the Historical Forecasting Project (HFP; Derome et al. 2001). In 2007 this system was upgraded to 10 forecasts from each of four AGCMs, with corresponding retrospective forecasts provided by the second phase of the HFP, or HFP2 (Kharin et al. 2009).
Although SST anomaly persistence provides a reasonable forecast of SST at relatively short range, this ceases to be the case beyond a few months when the forecast range approaches the decorrelation time scale for SST anomalies (Goddard and Mason 2002). For this reason, the HFP2 two-tier dynamical forecasts were restricted to a range of four months, providing seasonal (three month) mean surface temperature and precipitation at zero- and one-month lead, as well as first-month forecasts for surface temperature. Longer leads necessitated a different approach, so seasonal forecasts of surface temperature within Canada at leads of three, six, and nine months were obtained statistically through canonical correlation analysis (CCA) as described in Shabbar and Barnston (1996).
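The persisted-anomaly boundary condition described above can be illustrated with a minimal sketch. It assumes a 12-month SST climatology and the observed mean SST for the month preceding the forecast; the function name, argument names, and array layout are hypothetical and are not taken from the HFP2 system.

```python
import numpy as np

def persisted_sst(clim, obs_prev_month, prev_month, n_lead):
    """Return forecast SSTs with the last observed anomaly held fixed.

    clim           : array (12, ...) of monthly climatological SST
    obs_prev_month : array (...) of observed mean SST for the month before the forecast
    prev_month     : calendar index (0-11) of that observed month
    n_lead         : number of forecast months to generate
    """
    # Anomaly of the most recent observed month relative to its climatology.
    anomaly = obs_prev_month - clim[prev_month]
    # Calendar months covered by the forecast, wrapping around the year end.
    months = [(prev_month + 1 + k) % 12 for k in range(n_lead)]
    # The anomaly is persisted unchanged; only the climatology varies with month.
    return np.stack([clim[m] + anomaly for m in months])
```

Because the anomaly never evolves in this scheme, its usefulness is limited to ranges shorter than the decorrelation time scale of SST anomalies, which is the restriction noted above.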
A one-tier system eliminates the need to apply different forecast methods at different leads as in the hybrid two-tier/statistical system described above, and has the potential also to improve the quality of the forecasts. The development of a one-tier multiseasonal forecasting system suitable for operational use in Canada began with a pilot project, the first phase of the Coupled Historical Forecast Project (CHFP1; Merryfield et al. 2010), that adapted the Canadian Centre for Climate Modeling and Analysis (CCCma) CGCM to seasonal forecasting by applying a simple initialization method to the then-current CGCM3.1(T63) model version. This approach, similar to that of Keenlyside et al. (2005), relaxed model SSTs to observation-based time series during a multiyear period preceding each forecast. Although far from optimal, this method did initialize the crucial equatorial Pacific region with some skill, leading to reasonably skillful El Niño–Southern Oscillation (ENSO) forecasts and global skills competitive with those of the HFP2 system despite the smaller ensemble size (one model with 10 ensemble members vs four models with 10 ensemble members each). In addition, the skill of CHFP1 SST forecasts out to 12 months considerably exceeded that of the persistence assumption employed by the HFP2 two-tier forecasts.
CHFP1, although modest in scope and leaving considerable room for improvement, served to develop computational infrastructure and associated verification tools in parallel with efforts to improve the model and its initialization. In addition, CHFP1 provided a baseline against which the skill of subsequent versions of the forecast system could be compared.
The remainder of the paper describes the second phase of the Coupled Historical Forecast Project (CHFP2), which led to the development of the Canadian Seasonal to Interannual Prediction System (CanSIPS). This second-generation system considerably improved on the first-generation CHFP1 system through extensive model development, its use of two model versions, CCCma Coupled Climate Model, versions 3 and 4 (CanCM3 and CanCM4, respectively), and the implementation of a far more comprehensive initialization procedure. It was designed as an operation-ready system, and in late 2011 it replaced Canada's previous operational seasonal prediction system, which was based on two-tier dynamical and longer-range statistical forecasts as detailed above. Sections 2 and 3 describe the formulation and properties of CanCM3 and CanCM4, which are used mainly for climate forecasting and have not been described in detail elsewhere. Section 4 focuses on the procedures used to initialize these models, and a summary and discussion are provided in section 5. A companion paper will describe in detail the performance of the CanSIPS historical forecasts.
2. CanCM3 and CanCM4 climate models
CanSIPS combines ensemble forecasts from two versions of the CCCma climate model, CanCM3 and CanCM4, in order to take advantage of the generally greater skill of multimodel ensembles for a given ensemble size (e.g., Kirtman and Min 2009; Kharin et al. 2009). These two models share a common ocean component, CCCma's fourth-generation ocean model (CanOM4), but differ in their atmospheric components: CanCM3 uses CCCma's third-generation atmospheric general circulation model (CanAM3; also known as AGCM3), whereas CanCM4 employs the fourth-generation version (CanAM4). By comparison, the CHFP1 pilot project described in the previous section used CanAM3 with an earlier ocean model version CanOM3; this was essentially the CGCM3.1(T63) climate model that contributed to the Intergovernmental Panel on Climate Change Fourth Assessment Report (IPCC AR4), except that flux adjustments were not used in the hindcasts. These model configurations are summarized in Table 1.
Table 1. CCCma climate model configurations used for CHFP2/CanSIPS in relation to other applications (IPCC AR4, CHFP1, and IPCC AR5).


a. Model components
The CanAM3 atmospheric component is described in detail by McFarlane et al. (2005) and Scinocca et al. (2008). Briefly, it is a spectral model which uses a T63 truncation and 31 hybrid (sigma pressure) vertical coordinate levels extending from the surface to 1 hPa. Physical tendencies are computed on the linear transform grid yielding a horizontal grid spacing of approximately 2.8°. Land surface processes are represented by version 3 of the Canadian Land Surface Scheme (CLASS; Verseghy 2000), and sea ice dynamics are treated according to the cavitating fluid approach of Flato and Hibler (1992).
CanAM4 operates with the same horizontal resolution and upper boundary as CanAM3, but has 35 hybrid vertical coordinate levels providing more uniform resolution across the tropopause. Upgraded physical parameterizations in CanAM4 include a correlated-k distribution radiative transfer scheme (Li and Barker 2005), a more general treatment of radiative transfer in the presence of clouds using the Monte Carlo independent column approximation (Barker et al. 2008), a prognostic bulk aerosol scheme with a full sulphur cycle, along with organic and black carbon, mineral dust, and sea salt (Lohmann et al. 1999; Croft et al. 2005), a fully prognostic single-moment cloud microphysics scheme (Lohmann and Roeckner 1996; Rotstayn 1997; Khairoutdinov and Kogan 2000), and a new shallow convection scheme (von Salzen et al. 2005). Land surface and sea ice are treated as in CanAM3.
The ocean component CanOM4, common to CanCM3 and CanCM4, differs in several important respects from its predecessor CanOM3 used for IPCC AR4 and CHFP1. Vertical resolution is increased from 29 to 40 levels with vertical spacings ranging from 10 m near the ocean surface to greater than 300 m at abyssal depths. Subsurface heating due to penetration of shortwave radiation beneath the first level is represented as in Zahariev et al. (2008), with chlorophyll concentrations specified according to daily climatological means derived from the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) satellite data for 1998–2002 (Yoder and Kennelly 2003). Diapycnal mixing in the surface mixed layer and ocean interior is represented according to the K-profile parameterization of Large et al. (1994), together with a version of the tidal mixing parameterization of Simmons et al. (2004). Horizontal friction is specified according to the anisotropic viscosity formulation of Large et al. (2001), whereas eddy-induced transport and along-isopycnal diffusion are as in Gent et al. (1995).
b. Coupling
CanCM3 and CanCM4 couple their atmosphere and ocean components once per day. In both models, the ocean component receives daily mean surface heat, freshwater, and momentum fluxes computed by the atmospheric component, and after stepping forward by one day passes updated daily mean SST values back to the atmosphere. Ocean surface velocities are not taken into account in computing the surface momentum flux. The surface beneath each atmospheric cell is entirely ocean or land, with precisely six ocean grid cells under each atmospheric grid cell. While this configuration simplifies model physics and the interpolation of coupling fields, it enforces the relatively low atmospheric model resolution on the structure of coastlines. In contrast to previous CCCma climate model versions, no flux adjustments are applied in either CanCM3 or CanCM4 (Table 1).
c. Radiative forcing
CanCM3 and CanCM4 both represent anthropogenic influences on radiative forcing, inclusion of which has been shown to improve seasonal forecast skill in predicting global mean and regional temperatures and temperature trends (Doblas-Reyes et al. 2006; Liniger et al. 2007). In CanCM3 this is implemented through equivalent CO2 forcing representing the effects of all greenhouse gases (GHGs), with compensating influences of anthropogenic aerosols represented crudely by imposing a 40% reduction on increases of this GHG forcing above preindustrial levels. CanCM4 employs a far more comprehensive treatment of radiative forcing. Concentrations of CO2 and other radiatively important GHGs until 2005 are specified according to the representative concentration pathway (RCP) historical scenario developed for the IPCC Fifth Assessment Report (AR5) Coupled Model Intercomparison Project Phase 5 (CMIP5), whereas after 2005 the RCP4.5 scenario (Paolino et al. 2010) is used. Direct and indirect effects of aerosols on climate are treated through a prognostic bulk aerosol scheme with full sulphur cycle, organic and black carbon, mineral dust, and sea salt (von Salzen et al. 2005; Ma et al. 2010), with anthropogenic emissions specified through the CMIP5 historical and RCP4.5 scenarios. Solar cycle irradiance variations and volcanic stratospheric aerosols are treated according to CMIP5 recommendations. Because explosive volcanic eruptions cannot accurately be predicted, volcanic effects in the CanCM4 component of CanSIPS are represented by exponential decay of the initial volcanic stratospheric aerosols with a time scale of one year.
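Two of the simpler forcing treatments described above lend themselves to a short numerical sketch. The code below is illustrative only: it assumes that the 40% reduction in CanCM3 acts on the increase of a forcing-like quantity above its preindustrial value (the units are left arbitrary), and that the one-year volcanic "time scale" in CanCM4 is an e-folding time. Both function names are hypothetical.

```python
import math

def effective_ghg_forcing(ghg_forcing, preindustrial_forcing, aerosol_offset=0.40):
    """Scale back the GHG forcing increase above preindustrial by a fixed fraction.

    Assumption: the 40% reduction applies to the increase of the forcing
    quantity itself, which is treated here in arbitrary units.
    """
    increase = ghg_forcing - preindustrial_forcing
    return preindustrial_forcing + (1.0 - aerosol_offset) * increase

def decayed_volcanic_aerosol(initial_loading, years_since_start, e_folding_years=1.0):
    """Exponentially decay the initial stratospheric aerosol loading.

    Assumption: the one-year time scale is interpreted as an e-folding time.
    """
    return initial_loading * math.exp(-years_since_start / e_folding_years)
```

Under these assumptions, an aerosol loading present at the start of a forecast falls to about 37% of its initial value after 12 months, so a forecast initialized shortly after an eruption retains part of the volcanic cooling without requiring any prediction of future eruptions.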
3. Model performance
This section examines aspects of the behavior of freely running CanCM3 and CanCM4 simulations that bear upon the suitability of these models for subseasonal to multiseasonal forecasting. In particular, the models should be able to reasonably represent (i) global trends attributable to changes in radiative forcing, (ii) climatological averages and the mean seasonal cycle, and (iii) unforced climate variability on time scales relevant to long-range forecasting. These properties of the two models are examined below.
a. Global temperature trends
The gross effect of anthropogenic forcing changes is apparent in time series of global mean temperature. Figure 1 shows such time series for the period 1970–2009 from the Goddard Institute for Space Studies (GISS) observational dataset (Hansen et al. 2010; Fig. 1, top), a freely running, historically forced CanCM3 simulation (Fig. 1, middle), and 10 historically forced CanCM4 simulations (Fig. 1, bottom), where values shown are anomalies relative to 1970–90 averages. In addition to anthropogenically forced trends, these time series exhibit signatures of unforced internal climate variability, uncorrelated between observations and different model runs, which is attributable to ENSO and other influences (Fyfe et al. 2010). The observational and CanCM4 time series furthermore show transient coolings due to the large explosive volcanic events of 1982 and 1991; these are particularly evident in the CanCM4 ensemble mean because the unforced climate variability tends to be filtered out by the ensemble averaging process. These coolings are not present in the CanCM3 time series because that model does not represent volcanic aerosol effects.

Fig. 1. Monthly global average surface temperatures for 1970–2009, expressed as anomalies relative to the 1970–90 mean. (top) GISS observational dataset; (middle) freely running CanCM3; (bottom) freely running CanCM4 (ensemble mean of 10 runs, with range of values indicated by shading). Values missing from the observations are excluded in constructing the model averages. Temperature scale is the same for each case.
Citation: Monthly Weather Review 141, 8; 10.1175/MWR-D-12-00216.1

To facilitate accurate comparison between the modeled trends and those represented by the GISS and Hadley Centre/Climatic Research Unit temperature (HadCRUT) (not shown) observational datasets, two further processing steps were undertaken. First, the times and grid locations at which GISS and HadCRUT data are missing due to a lack of observations were excluded in computing the modeled global means (the model time series in Fig. 1 exclude data missing from the GISS dataset). Second, the signatures of dynamically induced atmospheric variability, ENSO, and explosive volcanic eruptions were removed using procedures described in Thompson et al. (2009); trends in the residual time series were then computed as in Fyfe et al. (2010), who demonstrated that this procedure substantially increases confidence in the calculated trends. The resulting estimated modeled trends and their statistical uncertainties are compared with those obtained from the GISS and HadCRUT datasets in Table 2. Of note is that while CanCM3 has a trend comparable to that inferred from observations, CanCM4's trend significantly exceeds that estimated from both observational datasets. This tendency is also evident in CMIP5 decadal predictions using CanCM4 (Fyfe et al. 2011; Kharin et al. 2012) as well as some other models (Kim et al. 2012).
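The first of these processing steps, sampling the model only where observations exist, can be sketched as follows. This is a minimal illustration, assuming fields on a regular latitude-longitude grid with missing observations stored as NaN and using cosine-of-latitude area weights; the function name and array layout are hypothetical. It does not attempt the second step (removal of dynamically induced, ENSO, and volcanic signatures).

```python
import numpy as np

def masked_global_mean(field, obs, lat):
    """Area-weighted global mean of `field`, using only cells where `obs` is finite.

    field, obs : float arrays (time, lat, lon); missing observations are NaN
    lat        : latitudes in degrees, shape (lat,)
    """
    # Cosine-of-latitude weights approximate grid-cell area on a regular grid.
    weights = np.cos(np.deg2rad(lat))[None, :, None] * np.ones_like(field)
    valid = np.isfinite(obs)
    # Zero the weight and the value wherever the observations are missing.
    weights = np.where(valid, weights, 0.0)
    filled = np.where(valid, field, 0.0)
    return (filled * weights).sum(axis=(1, 2)) / weights.sum(axis=(1, 2))
```

Applying the same mask to the model and to the observations means that differences in the resulting time series reflect differences in the data rather than differences in spatial coverage.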
Table 2. Trends in 1970–2009 time series of monthly global mean surface temperature, from HadCRUT and GISS observations, one CanCM3 historical run, and an ensemble of 10 CanCM4 historical runs. Values missing from the observations are excluded in constructing the respective model trends. Uncertainties represent 95% confidence intervals.


b. Mean climate and seasonal cycle
All climate models have imperfections that lead to biases in simulated climate. In a forecasting context where model initial conditions are constrained by the observed climate state at a particular time, this implies that the model will progressively drift from a comparatively realistic simulated climate at the beginning of the forecast period toward one that increasingly becomes imprinted with the biases of the unconstrained model. This drift is commonly removed by subtracting the forecast climatology, dependent on the time of year and forecast lead time and computed by averaging over a large set of historical forecasts, from the forecast fields to obtain predicted anomalies. While such a procedure compensates for model biases and consequent drifts, it is clearly desirable that biases be minimized, as errors in the mean state can affect the response of the atmosphere to surface forcing anomalies even when the anomalies are represented accurately (e.g., Balmaseda et al. 2010).
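The drift removal described above amounts to subtracting, from each forecast, the mean of all historical forecasts that share its start month, separately at each lead time. The following is a minimal sketch under that reading; the function name and array layout are hypothetical, and refinements often used in practice (such as excluding the forecast being corrected from its own climatology) are omitted.

```python
import numpy as np

def forecast_anomalies(hindcasts, start_months):
    """Subtract a start-month- and lead-dependent climatology from hindcasts.

    hindcasts    : array (n_forecasts, n_leads, ...) of forecast fields
    start_months : integer array (n_forecasts,), calendar month (0-11) of each start
    """
    hindcasts = np.asarray(hindcasts, dtype=float)
    start_months = np.asarray(start_months)
    anomalies = np.empty_like(hindcasts)
    for month in np.unique(start_months):
        sel = start_months == month
        # Forecast climatology for this start month, one value per lead time.
        climatology = hindcasts[sel].mean(axis=0)
        anomalies[sel] = hindcasts[sel] - climatology
    return anomalies
```

Because the climatology depends on lead time, the growing imprint of model bias is removed at each lead even though the forecast fields themselves continue to drift.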
1) Ocean climatology
Sea surface temperature is the most crucial ocean attribute for multiseasonal forecasting because it is the main avenue by which the relatively slowly evolving ocean influences the atmosphere. Although one-tier forecasting systems have a significant advantage over their two-tier counterparts in being able to predict the dynamical evolution of future SST anomalies, they also suffer an inherent disadvantage in that the model SST climatologies contain biases, whereas such biases can be eliminated in the specified SSTs used by two-tier forecasts. Having a realistic SST climatology is thus a highly desirable property for a multiseasonal forecast model.
The annual mean SST biases for CanCM3 and CanCM4, obtained by subtracting the 1982–2010 mean of the optimum interpolation SST (OISST, version 2) observational analysis (Reynolds et al. 2002) from like averages for historical model runs, are shown in Fig. 2. The annual global mean bias in CanCM3 is negligible, whereas mean SST in CanCM4 is 0.29°C cooler than observed (Table 3). Local biases are generally modest, within about ±2°C in most locations. Notable exceptions include a cold “bull's-eye” in the North Atlantic, attributable to an excessively zonal North Atlantic Current (Randall et al. 2007), and strong warm biases (more severe in CanCM3 than CanCM4) off the coasts of western tropical South America and Africa that are associated with underrepresented coastal upwelling and marine stratocumulus clouds (de Szoeke et al. 2010; Zheng et al. 2011). These specific localized strong biases are prevalent in many current-generation climate models.

Fig. 2. Annual mean SST biases for 1982–2010 in (top) CanCM3 and (bottom) CanCM4 historical runs, relative to the OISST observational dataset for the same period.
Table 3. Global mean SST bias and mean absolute error for the CanCM3 and CanCM4 models, based on differences between 1982–2010 means for free model runs under historical forcing and the OISST observational analysis as shown in Fig. 2.


Mean absolute error (MAE) provides an overall measure of model errors in SST climatology. For annual mean SST, MAE in CanCM4 is lower than in CanCM3 despite the smaller mean bias of the latter, indicating that CanCM4 tends to have smaller regional SST errors than CanCM3. The mean bias and MAE for individual seasons are not dramatically larger than for the annual mean (Table 3), indicating that the relatively small errors in annual mean SST do not arise from cancellation of much larger seasonally varying errors.
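The distinction between mean bias and MAE can be made concrete with a short sketch. It assumes SST climatologies on a regular latitude-longitude grid with NaN over land and cosine-of-latitude area weights; the function name is hypothetical, and the weighting used to produce Table 3 is not specified here.

```python
import numpy as np

def sst_bias_and_mae(model, obs, lat):
    """Area-weighted mean bias and mean absolute error over ocean points.

    model, obs : float arrays (lat, lon), NaN over land
    lat        : latitudes in degrees, shape (lat,)
    """
    diff = model - obs
    valid = np.isfinite(diff)
    weights = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(diff)
    weights = np.where(valid, weights, 0.0)
    diff = np.where(valid, diff, 0.0)
    # Signed errors can cancel in the bias but not in the MAE.
    bias = (diff * weights).sum() / weights.sum()
    mae = (np.abs(diff) * weights).sum() / weights.sum()
    return bias, mae
```

A model with equal areas of +1°C and −1°C error has zero mean bias but an MAE of 1°C, which is how a model can have the smaller global bias and still the larger regional errors.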
The equatorial Pacific Ocean is a particularly important region to model accurately because of its fundamental role in ENSO (Guilyardi et al. 2009). A common model bias in this region is an excessively cool upwelling or “cold tongue” region stretching across much of the equatorial Pacific (e.g., Reichler and Kim 2008). This bias is present in CanCM3 and CanCM4 but is not severe: its magnitude does not exceed 1.2°C in CanCM3 and 1.5°C in CanCM4.
The accuracy of the modeled seasonal cycle in the equatorial Pacific is important in climate models in part because its amplitude tends to anticorrelate with the modeled level of ENSO variability (Guilyardi 2006), although this is not the case for CanCM3 versus CanCM4. Figure 3 shows the average seasonal cycle about annual mean equatorial Pacific SST for 1982–2010 according to the OISST analysis and historical runs of CanCM3 and CanCM4. The observed seasonal cycle is dominated by the annual harmonic despite the strong semiannual component of solar radiation at the top of the atmosphere (Fu and Wang 2001). The seasonal cycle in the models is also primarily annual rather than semiannual, but is somewhat too weak and exhibits phasing errors, with a delayed spring maximum and premature autumn minimum.
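One way to quantify whether a seasonal cycle is primarily annual or semiannual is to compare the amplitudes of its first two harmonics. The sketch below does this for a 12-month climatology at a single location using a discrete Fourier transform; it is an illustrative diagnostic with a hypothetical function name, not the analysis used to produce Fig. 3.

```python
import numpy as np

def harmonic_amplitudes(monthly_clim):
    """Amplitudes of the annual and semiannual harmonics of a 12-month climatology."""
    x = np.asarray(monthly_clim, dtype=float)
    x = x - x.mean()                      # seasonal cycle about the annual mean
    coeffs = np.fft.rfft(x) / x.size      # one-sided Fourier coefficients
    annual = 2.0 * np.abs(coeffs[1])      # one cycle per year
    semiannual = 2.0 * np.abs(coeffs[2])  # two cycles per year
    return annual, semiannual
```

A cycle dominated by the annual harmonic, as in the observations and both models, gives an annual amplitude well above the semiannual one. Phasing errors of the kind noted above would appear in the phase of `coeffs[1]` rather than in its amplitude.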

Fig. 3. Climatological mean seasonal cycle of equatorial Pacific SST relative to the annual mean, for the period 1982–2010 from the (left) OISST observational dataset, (middle) CanCM3, and (right) CanCM4.
The equatorial Pacific thermocline is of fundamental importance because its vertical motions strongly influence ENSO-related SST anomalies, particularly in the eastern Pacific. Also, its properties such as zonal slope, which is strongly connected to the mean zonal wind stress, and mean depth strongly influence ENSO amplitude, period, and stability in simplified models of ENSO (e.g., Fedorov and Philander 2001). Physically the thermocline is defined as the depth of the maximum vertical temperature gradient, whereas the depth of the 20°C isotherm serves as a widely used practical definition that is applicable for present-day climate (Yang and Wang 2009). Figure 4 shows cross sections of 1991–2000 means of ocean temperature versus depth and longitude at the equator in the Simple Ocean Data Assimilation (SODA) 1.4.2 ocean analysis (Carton and Giese 2008), CanCM3, and CanCM4. The observed 20°C isotherm is indicated by the dashed red curve in all three panels. Its depth in the western and central Pacific is correctly represented in both models, whereas in the eastern Pacific the modeled 20°C isotherm is slightly too shallow, particularly in CanCM3. Also, the thermocline in CanCM3 tends to be too diffuse (i.e., vertical gradients near the depth of the 20°C isotherm are too weak except in the far eastern Pacific), which leads to a significant warm bias beneath the thermocline.
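The practical thermocline definition mentioned above, the depth of the 20°C isotherm, can be computed from a single temperature profile by linear interpolation between model levels. The following is a minimal sketch with a hypothetical function name; it assumes the profile is ordered from the surface downward and returns the shallowest crossing.

```python
import numpy as np

def isotherm_depth(temp, depth, target=20.0):
    """Depth at which a profile first cools to `target`, by linear interpolation.

    temp  : temperatures (deg C) ordered from the surface downward
    depth : matching depths (m), increasing
    Returns NaN if the profile never crosses `target`.
    """
    temp = np.asarray(temp, dtype=float)
    depth = np.asarray(depth, dtype=float)
    for k in range(temp.size - 1):
        upper, lower = temp[k], temp[k + 1]
        if upper >= target > lower:
            # Fraction of the layer traversed before reaching the target value.
            frac = (upper - target) / (upper - lower)
            return depth[k] + frac * (depth[k + 1] - depth[k])
    return float("nan")
```

Applying such a function at each longitude along the equator would yield a curve comparable to the dashed isotherm in Fig. 4, although the result depends on the vertical resolution of the profile.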

Fig. 4. Annual mean equatorial Pacific temperature as a function of longitude and depth for the period 1991–2000 from (a) the SODA 1.4.2 ocean analysis, (b) CanCM3, and (c) CanCM4. The observed 20°C isotherm is indicated by the dashed red curves.
Currents in the equatorial Pacific are primarily zonal and are dominated by the Equatorial Undercurrent (EUC). Figure 5 shows annual mean zonal velocity as a function of depth and longitude along the equator (left-hand panels) and as a function of depth and latitude at 140°W (right-hand panels) based on 1991–2000 averages from SODA 1.4.2, CanCM3, and CanCM4. Several biases are evident in the models. Although peak current speeds approach those observed, the maxima in the models are too deep (125–130 m in the models as compared to about 105 m observed) and too far west (140°–150°W in the models vs about 130°W observed). In addition, the westward-flowing South Equatorial Current is too strong at the equator in the models, and the eastward North Equatorial Counter Current, which peaks at about 50-m depth, is too weak.

Fig. 5. Annual mean Pacific zonal ocean velocity (left) as a function of depth and longitude at the equator and (right) as a function of depth and latitude at 140°W based on 1991–2000 averages from (a),(b) the SODA 1.4.2 ocean analysis; (c),(d) CanCM3; and (e),(f) CanCM4. Contour intervals are 10 cm s⁻¹, with westward velocities shaded in (b),(d), and (f).
The seasonally varying depth of the ocean surface mixed layer is another important global ocean property in relation to climate variability because it determines the effective ocean heat capacity governing the response of SST to atmospheric forcing (Saravanan and Chang 1999; Yu and Boer 2006). In the extratropics, ocean mixed layer depth (MLD) tends to be deepest in late winter, and shallowest in late summer. This seasonality is illustrated in Fig. 6, where MLD has been computed from the observational Polar Science Center Hydrographic Climatology (PHC)/World Ocean Atlas (WOA) climatology (Steele et al. 2001) and 1991–2000 model climatologies using the algorithm of Kara et al. (2000). In the winter hemispheres, local maxima of MLD corresponding to various mode water formation regions (Talley 1999) are generally well represented, although model MLDs in the Pacific subtropical and central mode water formation regions east of Japan are somewhat too deep, a common tendency in climate models (Lienert et al. 2011). A notable exception to this general agreement is that winter MLDs in the North Atlantic are represented rather poorly in the models. For example, the models show a zonally elongated region of shallow MLD at about 45°N that is not present in observations; this appears to be associated with the large surface cold bias in this region (Fig. 2) and coincident fresh bias (not shown), which in turn are attributable to errors in the path of the North Atlantic Current. In addition, the very deep winter mixed layers associated with deep water formation in the northwest Atlantic are displaced significantly southeastward in the models. In the summer hemispheres and in the tropics, the relatively shallow mixed layers are mainly wind driven and are represented reasonably well, although austral summer MLD in the Southern Ocean at around 60°S tends to be somewhat too shallow in the models.

Fig. 6. Ocean mixed layer depths in (left) March and (right) September based on monthly mean temperature and salinity from the (top) PHC/WOA observational climatology and 1991–2000 model climatologies from historical runs of (middle) CanCM3 and (bottom) CanCM4, computed using the algorithm of Kara et al. (2000).
Volume transports associated with various aspects of ocean circulation provide further metrics for assessing the ocean component of climate models, and a standard set of such metrics has been developed for CMIP5 (e.g., Griffies et al. 2011). Table 4 lists values for a subset of the CMIP5 ocean transports as determined from 1981–2010 averages from one historically forced CanCM3 simulation and 10 historically forced CanCM4 simulations, in comparison with available observationally based values. The maximum transport associated with the Atlantic meridional overturning circulation (AMOC), as represented by the streamfunction for zonally averaged flow in the Atlantic and including the parameterized effects of eddies, is about 18 Sv in both models (1 Sv ≡ 10⁶ m³ s⁻¹; no specific observation-based comparison is available for this case). A measure of the AMOC that is observationally constrained is transport due to the mean flow (i.e., excluding effective eddy transports) near 25°N. This is about 15 Sv in both models, which is about 3 Sv lower than both the inverse model-based estimate of Ganachaud (2003) at 24°N and a 4-yr mean value from the Natural Environment Research Council (NERC) Rapid Climate Change (RAPID) monitoring array (Rayner et al. 2011). Both models have Antarctic Circumpolar Current transports through Drake Passage that are toward the high end of the range of mean values estimated by Cunningham et al. (2003). Modeled Indonesian Throughflow transports are somewhat stronger than the upper range of mean values estimated by Sprintall et al. (2009), whereas mean transports through Bering Strait are comparable to those estimated by Roach et al. (1995).
Modeled and observationally determined ocean volume transports.


2) Atmospheric climatology
The most widely used products of long-range prediction systems have historically been monthly or seasonal mean anomalies of near-surface (2 m) temperature and accumulated precipitation. Although model biases in these forecast fields can be corrected for as described at the beginning of this subsection, the magnitude of freely running model biases will affect the rate of forecast drift and possibly the quality of the forecasts themselves.
For atmospheric quantities, two sources of bias are (i) biases intrinsic to the AGCM that appear even when realistic SST and sea ice boundary conditions are prescribed, and (ii) coupled biases that arise from errors in the representation of these boundary conditions by the coupled model. Some intrinsic biases of CanAM3 are described in McFarlane et al. (2005) and Scinocca et al. (2008), whereas von Salzen et al. (2013) discuss biases (mainly relating to clouds and precipitation) of CanAM4.
Figure 7 shows near-surface temperature biases in historically forced runs of CanCM3 and CanCM4, based on boreal winter [December–February (DJF)] and summer [June–August (JJA)] averages for 1981–2010 (hereafter all three-month periods are designated by the first letter of each respective month). Observations are represented by the 40-yr European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-40) and ECMWF Re-Analysis Interim (ERA-Interim) (Dee et al. 2011). CanCM3 tends to exhibit cold biases over most landmasses, particularly over western North America and southern Asia in DJF and most extratropical land in JJA. By contrast, CanCM4 shows strong cold biases over Antarctica in DJF, warm biases over much of Asia in JJA, and warm biases over much of North America and the Amazon basin in both seasons. Biases over land in the two models largely reflect the behavior of their atmospheric components when observation-based SSTs are prescribed (not shown).

Near-surface temperature biases in historically forced runs of (a),(b) CanCM3 and (c),(d) CanCM4 averaged over boreal (a),(c) winters (DJF) and (b),(d) summers (JJA) in 1981–2010. Observational values are from ERA-Interim for the same period.
Over the oceans, pronounced midlatitude DJF warm biases immediately east of Asia and North America in both models also appear to originate in the atmospheric models. However, the low-latitude warm biases immediately west of Africa and South America are amplified by coupling, particularly in CanCM3, and the strong DJF warm biases over the Southern Ocean in CanCM3 are not present in uncoupled CanAM3 simulations.
Modeled seasonal mean precipitation fields are compared with the Global Precipitation Climatology Project, version 2.1 (GPCP2.1), observational dataset (Adler et al. 2003) in Fig. 8, where the top row shows GPCP2.1 climatological means for DJF (left) and JJA (right) based on a 1981–2010 averaging period. In general, these large-scale patterns are represented reasonably well in the models. However, there are some significant regional biases, shown in the remaining panels of Fig. 8. These include the following:
Too little precipitation in tropical South America;
Too much precipitation in sub-Saharan Africa, south of the equator in DJF and north of the equator in JJA;
Too little DJF precipitation in the Middle East, particularly in CanCM4;
Too little DJF precipitation in northern Australia, particularly in CanCM3;
Too little monsoon-season (JJA) precipitation in southern and Southeast Asia;
Too little JJA precipitation in southern and eastern North America, particularly in CanCM4.

Observed climatological precipitation for (a) DJF and (b) JJA, based on the GPCP2.1 observational dataset for 1981–2010. Seasonal-mean precipitation bias in a CanCM3 historical run for (c) DJF and (d) JJA, relative to GPCP2.1 for 1981–2010. (e),(f) As in (c),(d), but for CanCM4.
CanCM3 and CanCM4 also exhibit a so-called split intertropical convergence zone (ITCZ) in the tropical Pacific, with a South Pacific convergence zone that is too zonal and extends too far eastward. This bias, which is common in climate models, is particularly severe in CanCM4. The split-ITCZ bias is not as pronounced in uncoupled CanAM3 and CanAM4, and thus appears to be attributable to errors in representing coupled feedbacks, although a general tendency for excessive tropical precipitation in atmospheric models may be a primary cause of such errors (Lin 2007).
Model biases in representing seasonally averaged mean sea level pressure are shown in Fig. 9. The magnitude of such biases is generally reduced in CanCM4 compared to CanCM3, particularly in the Southern Hemisphere. Biases over land are mostly similar to those in atmospheric model runs in which observational SSTs are applied (not shown). However, coupling induces some significant changes in the bias patterns, mainly over ocean regions. For example, the dipolar bias pattern over the northeast Atlantic in DJF, which is of some significance because of its influence on the North Atlantic storm track and European climate, is common in climate models (e.g., Hazeleger et al. 2011) but is intensified by coupling in both the CanSIPS models and other models (e.g., Donner et al. 2011). This may be due to the influence of the strong localized cold bias in North Atlantic SST, evident in Fig. 2, on atmospheric circulation (Keeley et al. 2012). Other features apparently induced by coupling include the DJF high pressure bias in the subtropical North Pacific and the low pressure bias southeast of Australia in CanCM4.

Mean sea level pressure biases in historically forced runs of (a),(b) CanCM3 and (c),(d) CanCM4 averaged over boreal (a),(c) winters (DJF) and (b),(d) summers (JJA) in 1981–2010. Observational values are from ERA-40 and ERA-Interim for the same period.
3) Sea ice climatology and trends
Global climate models such as CanCM3 and CanCM4 that simulate sea ice prognostically can potentially generate forecasts of sea ice conditions, although the utility of such forecasts has not yet been explored thoroughly. As is the case for other climate variables, the quality of the sea ice simulation in a freely running model is likely to bear on its predictive capability.
Sea ice extent, defined as the area over which sea ice concentration exceeds 0.15, undergoes large seasonal cycles in both hemispheres, with maximum extent typically occurring in late winter and minimum extent in late summer. Figure 10 compares the observed mean seasonal cycles during 2001–10 with corresponding values for CanCM3 and CanCM4. The winter maximum of Northern Hemisphere ice extent is too large in both models (top panel of Fig. 10), with too much ice in the Labrador, Greenland, Barents, and Okhotsk Seas, and too little in the Bering Sea (Figs. 11a,c). By contrast, late summer ice extent is too high in CanCM3 but too low in CanCM4, with corresponding ice concentration biases evident throughout much of the Arctic Ocean (Figs. 11b,d).
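The extent definition above can be illustrated with a minimal Python sketch. The function name, the nested-list grid layout, and the use of None for land cells are illustrative assumptions, not part of the CanSIPS or HadISST processing. Note that extent counts the full area of every cell above the threshold, rather than weighting each cell by its concentration.

```python
def sea_ice_extent(concentration, cell_area_km2, threshold=0.15):
    """Total area of grid cells whose ice concentration exceeds `threshold`.

    concentration: 2D list of fractional concentrations (0-1); None marks land.
    cell_area_km2: 2D list of matching grid-cell areas in km^2.
    """
    extent = 0.0
    for conc_row, area_row in zip(concentration, cell_area_km2):
        for conc, area in zip(conc_row, area_row):
            # A cell contributes its whole area once it passes the threshold.
            if conc is not None and conc > threshold:
                extent += area
    return extent


# Hypothetical 2 x 3 grid, for illustration only (not observed values).
conc = [[0.95, 0.40, 0.10],
        [0.16, None, 0.15]]
area = [[100.0, 100.0, 100.0],
        [120.0, 120.0, 120.0]]
extent_km2 = sea_ice_extent(conc, area)
print(extent_km2)  # 320.0
```

In this toy grid the cell with concentration exactly 0.15 is excluded because the definition requires the concentration to exceed the threshold.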

(top) Climatological mean seasonal cycle for 1981–2000 of Northern Hemisphere sea ice extent from the HadISST1.1 observational dataset and historical runs of CanCM3 and CanCM4. (bottom) Seasonal climatologies for Southern Hemisphere sea ice extent.

Climatological sea ice concentration biases for CanCM3 in (a) March and (b) September, relative to HadISST1.1 for 2001–10. Similarly computed biases for CanCM4 in (c) March and (d) September.
Northern Hemisphere sea ice extent has undergone a steady decline, particularly in its annual minimum, since continuous satellite monitoring began in 1979. Figure 12 shows time series of annual maximum (March) and minimum (September) ice extent beginning in 1979, from observations and historical runs of the two models (the model time series extend to 2020). Also shown are corresponding linear trends for the 1979–2011 period. March trends for both models are similar to the observed trend, despite the unrealistically high March ice extents in the models. The trend in September extent in CanCM4 is similar to the observed trend, whereas it is markedly weaker in CanCM3.
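As a rough illustration of how such linear trends can be obtained, the sketch below fits an ordinary least-squares line to an annual series. The text does not state which fitting method was used, so least squares is an assumption here, and the extent values are synthetic rather than observed.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope and intercept of `values` against `years`."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic September extents (10^6 km^2), NOT observations: an exact
# decline of 0.08 per year starting from 7.5 in 1979.
years = list(range(1979, 2012))
extent = [7.5 - 0.08 * (y - 1979) for y in years]
slope, intercept = linear_trend(years, extent)
print("trend per decade:", round(slope * 10, 3))
```

Because the synthetic series is exactly linear, the fitted slope recovers the prescribed decline; with real data the slope would also reflect interannual variability within the chosen period.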

Time series, dating from the start of the satellite record in 1979, of monthly mean Northern Hemisphere sea ice extent in March and September from HadISST1.1 (dashed line), as well as historical runs of CanCM3 (light solid line) and CanCM4 (dark solid line). Corresponding linear trends for 1979–2011 are indicated by the straight lines.
In the Southern Hemisphere, both models have too much sea ice in austral winter, and too little in austral summer (Fig. 10, bottom panel). The latter bias is more severe in CanCM3 than in CanCM4, reflecting the Southern Ocean warm bias in CanCM3, which is largest in austral summer, whereas in austral winter both models have a cold SST bias in this region (not shown). Such an apparent correspondence between biases in Antarctic sea ice extent and Southern Ocean SST is evident in other current-generation climate models as well (Griffies et al. 2011; Sterl et al. 2012; Landrum et al. 2012).
c. Climate variability
The ability of dynamical models to forecast climate on subseasonal, seasonal, and longer time scales with some skill rests largely on their ability to represent potentially predictable climate anomalies when integrated forward from observationally constrained initial states. It is clearly desirable, therefore, that forecast models be able to represent potentially predictable climate phenomena when not constrained by observational data. This subsection examines CanSIPS model representations of two such phenomena: ENSO, which is a major source of predictability on seasonal to multiseasonal time scales, and the Madden–Julian oscillation (MJO), which imparts predictability on subseasonal time scales. Although both of these phenomena originate in the tropics they have wider-ranging influences; therefore, associated global teleconnections are considered as well.
1) ENSO variability
Versions of the CCCma coupled model preceding those used in CanSIPS exhibited ENSO variability that was much too weak and tended to be concentrated unrealistically in the central equatorial Pacific (e.g., Merryfield 2006). Modeled ENSO variability has, however, become considerably more realistic with the introduction of improved ocean and atmospheric model components in CanCM3 and CanCM4.
ENSO variability is commonly described through anomalies of the Niño-3.4 index, defined as average SST in a region bounded by 5°S–5°N, 120°–170°W. Positive Niño-3.4 anomalies of sufficient magnitude and duration are indicative of El Niño events, whereas negative anomalies are similarly indicative of La Niña events.
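A minimal sketch of the index calculation follows. It assumes a regular latitude-longitude grid with longitudes in degrees east (so that 170°–120°W becomes 190°–240°E), cosine-latitude area weighting, and anomalies taken relative to each calendar month's long-term mean. None of these choices is specified in the text, and the function names and grid values are hypothetical.

```python
import math


def box_mean_sst(sst, lats, lons, lat_bounds=(-5.0, 5.0), lon_bounds=(190.0, 240.0)):
    """Cosine-latitude-weighted mean of sst[j][i] over a lat-lon box.

    lats[j] are in degrees north and lons[i] in degrees east (0-360);
    the default box is the Nino-3.4 region.
    """
    total = 0.0
    weight_sum = 0.0
    for j, lat in enumerate(lats):
        if not lat_bounds[0] <= lat <= lat_bounds[1]:
            continue
        weight = math.cos(math.radians(lat))
        for i, lon in enumerate(lons):
            if lon_bounds[0] <= lon <= lon_bounds[1]:
                total += weight * sst[j][i]
                weight_sum += weight
    return total / weight_sum


def monthly_anomalies(series):
    """Remove each calendar month's mean from a monthly series that starts
    in January and spans at least one full year."""
    sums = [0.0] * 12
    counts = [0] * 12
    for t, value in enumerate(series):
        sums[t % 12] += value
        counts[t % 12] += 1
    clim = [s / c for s, c in zip(sums, counts)]
    return [value - clim[t % 12] for t, value in enumerate(series)]


# Hypothetical coarse grid; the values are illustrative, not observations.
lats = [-7.5, -2.5, 2.5, 7.5]
lons = [170.0, 200.0, 230.0, 260.0]
sst = [[25.0, 25.0, 25.0, 25.0],
       [29.0, 27.0, 26.0, 24.0],
       [29.0, 27.5, 26.5, 24.0],
       [25.0, 25.0, 25.0, 25.0]]
nino34 = box_mean_sst(sst, lats, lons)

# Two years of a synthetic index: a seasonal cycle plus a 1-degree step.
index = [float(t % 12) + (0.0 if t < 12 else 1.0) for t in range(24)]
anoms = monthly_anomalies(index)
```

Only the four grid points inside the box contribute to the average; the anomaly step then removes the seasonal cycle, leaving the year-to-year signal.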
Figure 13 (top) shows a 50-yr time series (1961–2010) of detrended monthly Niño-3.4 anomalies as determined from the Hadley Centre Sea Ice and SST, version 1.1 (HadISST1.1), observational dataset, considered here because of its relatively long temporal coverage and the fact that ENSO variability undergoes considerable interdecadal variation. Evident in this record are El Niño and La Niña events that occur 2–3 times per decade on average, with occasional particularly large events such as the El Niños of 1982/83 and 1997/98. Comparable time series for CanCM3 and CanCM4 are shown in the middle and bottom panels, respectively. Corresponding power spectra plotted in a variance-preserving format are shown in Fig. 14, where shading indicates the ensemble standard deviation for the 10 CanCM4 runs. Some statistics of these time series are summarized in Table 5; these include a characteristic period defined as the period at which the power spectral variance is bisected (Merryfield 2006).
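One possible reading of the characteristic period is sketched below: accumulate spectral power from low to high frequency and report the period at which half of the total has been reached. This is an illustrative interpretation only. It uses a raw periodogram, omits the tapering and Parzen smoothing applied to the plotted spectra, and does not interpolate between frequency bins; the function names are hypothetical.

```python
import cmath
import math


def periodogram(series):
    """Raw periodogram power at frequencies k/N cycles per time step,
    for k = 1 .. N//2, after removing the series mean."""
    n = len(series)
    mean = sum(series) / n
    x = [v - mean for v in series]
    power = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2)
    return power


def characteristic_period(series):
    """Period (in time steps) at which the cumulative spectral power,
    summed from low to high frequency, first reaches half of the total."""
    power = periodogram(series)
    total = sum(power)
    if total == 0.0:
        raise ValueError("series has no variance")
    running = 0.0
    for k, p in enumerate(power, start=1):
        running += p
        if running >= 0.5 * total:
            return len(series) / k
    return len(series) / len(power)  # fallback; the loop normally returns first


# Synthetic 50-yr monthly series with a single 40-month oscillation.
series = [math.sin(2.0 * math.pi * t / 40.0) for t in range(600)]
period_months = characteristic_period(series)
print(period_months / 12.0, "yr")
```

For this pure sinusoid all of the variance sits at one frequency, so the bisecting period coincides with the oscillation period of 40 months.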

Monthly mean Niño-3.4 index anomalies for 1961–2010 from (top) HadISST1.1 and historical runs of (middle) CanCM3 and (bottom) CanCM4.

Power spectra in variance preserving format of the 1961–2010 monthly Niño-3.4 index from HadISST1.1 (dashed line), one CanCM3 historical run (light solid line), and 10 CanCM4 historical runs (dark solid line, with shading representing the ensemble standard deviation). Split-cosine tapering was applied to the first and last 10% of time series, and spectra were smoothed using a 12-bin Parzen window.
Observational and model statistics of detrended monthly Niño-3.4 index for 1961–2010.


ENSO variability in CanCM3, although stronger than in previous model versions, is still somewhat weaker than observed, with a standard deviation of 0.54 for this period as compared to 0.84 for HadISST. CanCM3 variability is also too rapid, with a characteristic period of slightly over 2 yr as compared with the observed value of 3.27 yr. By contrast, ENSO in CanCM4 is more realistic both in terms of its amplitude, which is slightly too large on average, and its characteristic period, which is approximately correct. Also evident from the power spectra is that CanCM3 has far too little Niño-3.4 variability on decadal to multidecadal time scales, whereas CanCM4 has slightly too much. Another statistic in Table 5 is skewness: observed ENSO SST anomalies tend to have positive skewness in the Niño-3.4 region, with El Niño anomalies stronger than La Niña anomalies (Burgers and Stephenson 1999). However, Niño-3.4 skewness is negative in both models. Modeled skewness remains negative farther east in the Niño-3 region bounded by 5°S–5°N, 90°–150°W (not shown), in contrast to observed positive values, which increase monotonically toward South America (e.g., Monahan and Dai 2004).
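The amplitude and skewness statistics discussed here can be computed as in the following sketch, which assumes population (1/n) moments; the estimator actually used for Table 5 is not stated in the text, and the sample values are synthetic.

```python
import math


def std_and_skewness(anomalies):
    """Standard deviation and skewness from population (1/n) central moments."""
    n = len(anomalies)
    mean = sum(anomalies) / n
    m2 = sum((a - mean) ** 2 for a in anomalies) / n
    m3 = sum((a - mean) ** 3 for a in anomalies) / n
    std = math.sqrt(m2)
    return std, m3 / std ** 3


# Synthetic sample with one strong warm event and several weaker cold ones.
warm_skewed = [-1.0, -1.0, -1.0, 3.0]
std_warm, skew_warm = std_and_skewness(warm_skewed)

# Mirror image: one strong cold event, as in a negatively skewed index.
std_cold, skew_cold = std_and_skewness([-a for a in warm_skewed])
```

The two samples have identical standard deviations but skewness of opposite sign, which is why skewness is reported as a separate diagnostic from amplitude.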
An important aspect of ENSO variability is its seasonality, indicated in Fig. 15 by the standard deviation of the Niño-3.4 index as a function of calendar month. Observed ENSO variability (dashed) tends to be strongest in boreal winter, peaking in December, and weakest in late spring. These tendencies are not always represented realistically in climate models (Joseph and Nigam 2006). However, ENSO seasonality is approximately correct in CanCM3 and CanCM4 (light and dark solid lines, respectively), with the amplitude biases persisting in all calendar months, although the seasonal minimum in ENSO activity occurs about one month too late (in May instead of April) in both models.
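The seasonality diagnostic described above amounts to grouping the anomaly series by calendar month and taking a standard deviation within each group. A minimal sketch, assuming a series that begins in January and using synthetic data with a prescribed December maximum:

```python
import math


def std_by_calendar_month(anomalies):
    """Standard deviation for each calendar month (index 0 = January) of a
    monthly anomaly series that starts in January."""
    result = []
    for month in range(12):
        values = anomalies[month::12]  # every 12th value: one calendar month
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        result.append(math.sqrt(variance))
    return result


# Synthetic 4-yr series (not model output): anomalies alternate sign each
# year, with amplitude 2 in December and 1 in all other months.
amplitude = [1.0] * 11 + [2.0]
anoms = [(-1) ** year * amplitude[month] for year in range(4) for month in range(12)]
monthly_std = std_by_calendar_month(anoms)
peak_month = monthly_std.index(max(monthly_std)) + 1  # 1 = January
```

Here the peak is recovered in month 12, mirroring the December maximum in the observed index.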

Seasonal cycle of monthly Niño-3.4 index standard deviation, based on 1961–2010 time series from HadISST1.1 (dashed line), one CanCM3 historical run (light solid line), and 10 CanCM4 historical runs (dark solid line, with shading representing the ensemble standard deviation).
The spatial pattern and regional intensity of ENSO SST variability are indicated in Fig. 16 by the standard deviation of monthly SST anomalies throughout the tropical Pacific. The model biases in ENSO amplitude (too weak in CanCM3 and too strong in CanCM4) are again evident. In addition, there are further biases in the patterns of near-equatorial SST variability. For example, the meridional extent of modeled variability is too narrow, particularly in CanCM3, and the SST anomalies extend too far westward, nearly to the Maritime Continent, in both models. These biases have each been found to occur frequently in other coupled models (Joseph and Nigam 2006).

Standard deviations of 1961–2010 monthly SST anomalies from (top) HadISST1.1 and historical runs of (middle) CanCM3 and (bottom) CanCM4.
2) ENSO teleconnections
An important attribute of seasonal forecast models is their ability to represent ENSO teleconnections, since the global influence of ENSO plays a substantial role in shaping seasonal climate anomalies. In North America, these influences tend to be strongest in boreal winter (e.g., Ropelewski and Halpert 1986; Shabbar et al. 1997), although ENSO influences on western Canadian temperatures persist into early spring (Shabbar 2006). In addition, precipitation over the Great Basin and possibly the high plains of the United States may be influenced in summer (Ropelewski and Halpert 1986), although the strength of this signal varies across different analyses (Wang et al. 2007).
Observed and modeled ENSO teleconnections are assessed here through linear regressions of climate variables with the Niño-3.4 index, as for example in Yang and DelSole (2012).³ This approach assumes linearity, implying opposite responses to El Niño and La Niña, although that approximation is known to be violated to varying degrees in certain regions (e.g., Wu and Hsieh 2004; Wang et al. 2007). Observed teleconnections are diagnosed from the ERA-40 and ERA-Interim reanalyses for surface temperature and pressure, and the Global Precipitation Climatology Project, version 2.2 (GPCP2.2), dataset for precipitation, for the years 1981–2010, which corresponds to the CanSIPS hindcast period; modeled teleconnections are diagnosed for the same period in CanCM3 and CanCM4 historical runs. Statistical significance is assessed very simply by requiring that the magnitude of the local correlation coefficient exceed 0.3. Assuming temporal independence in the 30-yr sample, this implies p ≲ 0.1 or >90% confidence for a two-tailed test. Such a local significance test does not account for the finite extent and hence spatial correlation of teleconnection patterns, and so may tend both to underestimate the spatial coherence and extent of true teleconnections and to produce occasional small-scale “false positives.” A more sophisticated field significance test that takes spatial correlations into account is described in DelSole and Yang (2011) and Yang and DelSole (2012).
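The local regression and the significance level implied by the 0.3 threshold can be checked with a short sketch. It assumes an ordinary least-squares regression at a single grid point and a Student's t test with n − 2 degrees of freedom, the standard test for a correlation coefficient under temporal independence; the function names are hypothetical, and the t distribution is integrated numerically to keep the example free of external libraries.

```python
import math


def regress_on_index(index, field_series):
    """Least-squares slope of field_series on index, and their Pearson correlation."""
    n = len(index)
    mx = sum(index) / n
    my = sum(field_series) / n
    sxx = sum((x - mx) ** 2 for x in index)
    syy = sum((y - my) ** 2 for y in field_series)
    sxy = sum((x - mx) * (y - my) for x, y in zip(index, field_series))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


def two_tailed_p(r, n, steps=2000):
    """Two-tailed p value for a correlation r (|r| < 1) from n independent
    samples, using Student's t with n - 2 degrees of freedom."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1.0 - r * r))
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return norm * (1.0 + x * x / df) ** (-(df + 1) / 2)

    # Simpson's rule (steps must be even) for the probability between 0 and t.
    h = t / steps
    area = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3.0
    return 1.0 - 2.0 * area


# Significance level implied by |r| = 0.3 in a 30-yr sample.
p_threshold = two_tailed_p(0.3, 30)
print(round(p_threshold, 2))

# Toy check of the regression on an exactly linear relationship.
slope, corr = regress_on_index([-1.0, 0.0, 1.0, 2.0], [-1.0, 1.0, 3.0, 5.0])
```

The computed p value is close to 0.1, consistent with the approximate statement in the text; serial correlation in the seasonal means would reduce the effective sample size and raise it.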
ENSO-teleconnected regression patterns for boreal winter (DJF) are shown in Fig. 17. Surface temperature and mean sea level pressure (MSLP) anomalies are indicated by colors and contours, respectively, in the left-hand panels, and standardized precipitation anomalies are shown in the right-hand panels. Values where the local correlation with the Niño-3.4 index is <0.3 in magnitude are not shown for reasons indicated above. For CanCM4, regression and correlation values averaged over the ensemble of 10 historical runs are used. Although the 0.3 correlation threshold is retained despite the larger sample size, to maintain consistency with the observational and CanCM3 maps, confidence in the CanCM4 patterns can be considered higher than for the other cases.

Regressions against DJF averages of the Niño-3.4 index of DJF averaged mean sea level pressure (contours), (a),(c),(e) near-surface temperature (colors) and (b),(d),(f) standardized precipitation anomalies (colors) for 1981–2010 from (a),(b) observations; (c),(d) one CanCM3 historical run; and (e),(f) 10 CanCM4 historical runs. Values for which the correlations are <0.3 (two-tailed p ≳ 0.1) are not plotted. Observational data sources are ERA-40 and ERA-Interim for pressure and temperature, and GPCP2.2 for precipitation.
Focusing first on the tropical oceans, the equatorial Pacific warm anomalies associated with El Niño in both models extend too far to the west as implied by the SST variability patterns in Fig. 16, whereas modeled warm anomalies in the tropical Indian and Atlantic Oceans occur more or less as in observations. The precipitation response in the tropical Pacific and eastern Indian Oceans is particularly important because its accuracy in models is a major factor determining the fidelity of extratropical teleconnections which are forced largely by tropical diabatic heating anomalies (AchutaRao and Sperber 2006, and references therein). Both models correctly show that a strong precipitation increase in the central and eastern equatorial Pacific occurs with El Niño events. In observations, associated diabatic heating anomalies occurring near the date line are largely responsible for extratropical teleconnections (e.g., Trenberth et al. 1998). However, in CanCM3 and CanCM4 this region of increased precipitation extends too far westward, a common model bias that may in turn lead to westward displacements in extratropical teleconnections (Wittenberg et al. 2006).
The dominant feature in the MSLP regression patterns, other than the tropical east–west dipole associated with the Southern Oscillation (present but not clearly evident in Fig. 17 as a result of the choice of contour levels), is a pronounced deepening of the Aleutian low associated with El Niño. This feature, which is connected to shifts in circulation including the North Pacific storm track (Ren et al. 2007), is modeled reasonably realistically, although its center occurs somewhat too far west on average in the 10 CanCM4 historical runs.
Pronounced and extensive features occur in the DJF surface temperature regression patterns over all continental landmasses except Eurasia and Antarctica. [A cold feature in northeastern Asia identified by Yang and DelSole (2012) in observations is weakly evident in the present analysis.] An observed warm anomaly in northwestern North America associated with El Niño is evident in both models, although its extension into central North America in observations and CanCM3 is not evident in CanCM4. The extensive cold anomaly in the southeastern United States and northern Mexico in CanCM4 is in good agreement with that deduced from observations by DelSole and Yang (2011), although it is less extensive in the current analyses of observations and CanCM3. This feature does, however, become more prominent in March–May (MAM), while shifting westward in observations and both models; similarly the aforementioned warm anomalies remain in MAM but become confined to northwestern North America (not shown).
The primary North American signals in the DJF precipitation regression patterns, most evident in observations and CanCM4, are an El Niño–related precipitation increase across southern North America, accompanied by decreases in the Pacific Northwest and Great Lakes regions. These signals are in general agreement with analyses by Shabbar (2006) and Yang and DelSole (2012).
Other DJF teleconnections over land common to differing degrees in observations and both models are a general El Niño–linked warming over tropical landmasses and Australia that tends to be stronger in the models, reduced precipitation in southern Central America and northern South America accompanied by wetter conditions to the southeast, dry conditions over the Maritime Continent and Southeast Asia, and enhanced precipitation in southwest Asia, the Middle East, and the Horn of Africa. A strong El Niño–linked tendency toward drought in southern Africa is only weakly evident in the models, whereas some features in the models such as cooling in eastern China in CanCM3 and enhanced precipitation in West Africa in both models are not evident in observations.
Teleconnection patterns in the boreal summer months of JJA (Fig. 18) tend to be less distinct than in DJF, likely because ENSO itself is weaker in these months (Fig. 15). (The regression patterns for CanCM3 are particularly “noisy” because of its comparatively weak ENSO.) For temperature there is little commonality between observed and modeled patterns other than warming tendencies in parts of northern Africa, southern Asia, and equatorial South America, much as found by Yang and DelSole (2012), who compared observed patterns with those from coupled hindcasts.⁴

As in Fig. 17, but for JJA.
JJA precipitation signals over land include dry conditions over the Maritime Continent, the Horn of Africa, Central America, and northern South America (not clearly evident in CanCM3), along with decreased rainfall in the Indian summer monsoon region. There is also increased rainfall in the western United States in all three cases, although in differing regions. Such a signal, though identified in previous analyses (Ropelewski and Halpert 1986, 1987), was found not to be statistically significant by Yang and DelSole (2012). Some ENSO impacts on boreal summer rainfall that are evident in Fig. 18b and other analyses are not captured by CanCM3 or CanCM4. These include El Niño–linked dry conditions in the Sahel region of Africa and eastern Australia.
3) Interannual variability of seasonal means
A further potential bias in climate prediction models, in addition to biases in the climatological mean and seasonal cycle, is an erroneous level of interannual variability, as noted above for ENSO-related SST variability in CanCM3 and CanCM4. Such biases should be addressed, for example, when combining forecasts from different models, because models with unrealistically large interannual climate anomalies will tend to dominate multimodel ensemble averages unless some correction is applied (e.g., Kharin et al. 2009).
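To illustrate why a model with inflated variability can dominate a multimodel average, the sketch below rescales each model's anomalies by its own interannual standard deviation before averaging. This is only one simple form such a correction could take; it is not claimed to be the procedure of Kharin et al. (2009) or the one used in CanSIPS, and the function name `standardized_multimodel_mean` is hypothetical.

```python
import numpy as np

def standardized_multimodel_mean(model_anoms):
    """Average anomalies across models after scaling each to unit variance.

    model_anoms: list of arrays, one per model, with time (years) as the
    first axis, e.g. shape (time,) or (time, lat, lon).
    Assumes every model has nonzero interannual variance everywhere.
    """
    scaled = []
    for anom in model_anoms:
        anom = np.asarray(anom, dtype=float)
        # Interannual standard deviation of this model at each grid point.
        sigma = anom.std(axis=0, ddof=1)
        scaled.append(anom / sigma)
    # Each model now contributes with equal amplitude.
    return np.mean(scaled, axis=0)
```

For example, if one model's anomalies are ten times larger than another's but otherwise identical, the unscaled average is controlled almost entirely by the first model, whereas the standardized average weights the two equally.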
Figure 19 shows the ratio of modeled to observed standard deviations of seasonal mean (DJF and JJA) near-surface temperature anomalies for the 1981–2010 CanSIPS hindcast period. As in other instances these results are based on a single CanCM3 historical run and 10 CanCM4 historical runs. (Standard deviations for CanCM4 are computed as the mean of the standard deviations for the individual runs.) It is evident that interannual variability in CanCM4 is generally larger than in CanCM3, likely as a result of the difference in ENSO amplitudes. In particular, CanCM3 variability tends to be somewhat weaker than observed in most ocean regions whereas such variability in CanCM4 is slightly too large. A location of excessive ocean variability common to both models is the far western equatorial Pacific, a result of the unrealistic westward extension of ENSO SST anomalies illustrated in Fig. 16.
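The calculation described above, in which the modeled standard deviation is the mean of the standard deviations of the individual runs, can be sketched as follows. The array layout (run, year, lat, lon) and the function name `std_ratio` are assumptions made for illustration, as is the use of the sample (ddof=1) standard deviation.

```python
import numpy as np

def std_ratio(model_runs, observed):
    """Ratio of modeled to observed interannual standard deviation.

    model_runs: (run, year, lat, lon) seasonal-mean values.
    observed:   (year, lat, lon) seasonal-mean values.
    """
    model_runs = np.asarray(model_runs, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Standard deviation over years for each run, then averaged over runs.
    sigma_model = model_runs.std(axis=1, ddof=1).mean(axis=0)
    sigma_obs = observed.std(axis=0, ddof=1)
    return sigma_model / sigma_obs
```

Averaging the per-run standard deviations, rather than taking the standard deviation of the ensemble mean, avoids the reduction in variance that results from averaging across runs, so a single-run model and a multi-run model can be compared on the same footing.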

Ratios of modeled to observed standard deviations of seasonal mean near-surface temperatures in 1981–2010 for (a) CanCM3 in DJF, (b) CanCM3 in JJA, (c) CanCM4 in DJF, and (d) CanCM4 in JJA. Observed temperatures are from the ERA-40 and ERA-Interim.

Other biases in variability include CanCM4 having far too much variability in tropical South America, which is likely a consequence of the severe dry bias evident in Fig. 8 combined with the modest excess in ENSO variability. Variability in South Asia, central Asia, and much of North America is also too high in both models, particularly in JJA.
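Biases in the interannual variability of precipitation, as measured by the ratio of standard deviations, tend to scale with biases in the mean, which are considerable in some locations in fractional terms (Fig. 8). It is therefore informative to consider instead the ratio of modeled to observed standard deviations scaled by their respective means [i.e., the ratio of the coefficients of variation, $(\sigma_{\mathrm{mod}}/\mu_{\mathrm{mod}})/(\sigma_{\mathrm{obs}}/\mu_{\mathrm{obs}})$, where $\sigma$ and $\mu$ denote the interannual standard deviation and the climatological mean of seasonal precipitation] (Fig. 20).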
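The scaling by the mean described above can be sketched as follows. The function name `cv_ratio` is hypothetical, time is assumed to be the first array axis, and the mean precipitation is assumed to be nonzero at every grid point.

```python
import numpy as np

def cv_ratio(model, observed):
    """Ratio of modeled to observed coefficient of variation.

    model, observed: seasonal-mean precipitation with years along axis 0.
    The coefficient of variation is the interannual standard deviation
    divided by the climatological mean.
    """
    model = np.asarray(model, dtype=float)
    observed = np.asarray(observed, dtype=float)
    cv_model = model.std(axis=0, ddof=1) / model.mean(axis=0)
    cv_obs = observed.std(axis=0, ddof=1) / observed.mean(axis=0)
    return cv_model / cv_obs
```

As a simple check of the idea, a model whose precipitation is everywhere half the observed value has a standard deviation ratio of 0.5 but a coefficient of variation ratio of 1, so the measure separates a bias in variability from a bias in the mean.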

Ratio of modeled coefficient of variation of seasonal mean precipitation in 1981–2010 (i.e., standard deviation scaled by the mean,

4) MJO variability
The MJO is characterized by a coherent, eastward-propagating concentration of tropical deep convection that is most prominent in the Indo–Pacific and has a characteristic period of 30–80 days (Wheeler and Kiladis 1999). In addition to its considerable direct influence on tropical climate, diabatic heating anomalies associated with the MJO give rise to Rossby wave trains that influence extratropical climate in both hemispheres (e.g., Matthews et al. 2004; Donald et al. 2006). These influences include the lagged modulation of North American temperature and precipitation particularly in the boreal winter months (Mo and Higgins 1998; Jones 2000; Bond and Vecchi 2003; Vecchi and Bond 2004; Lin and Brunet 2009; Lin et al. 2010; Zhou et al. 2012) and a two-way connection with the North Atlantic Oscillation (Lin et al. 2009). In addition, the MJO is associated with westerly wind bursts that can contribute to the growth or termination of ENSO events (Seiki and Takayabu 2007).
The MJO is thus a potential source of subseasonal and possibly seasonal (through its influence on ENSO) climate predictability. Such predictability stems in part from the lagged influence of the initial state of the tropical Indo–Pacific atmosphere on other regions, but potentially also from any forecast skill in predicting the future evolution of the MJO itself. As for ENSO, the ability of a model to represent MJO variability when not constrained by observations is likely to bear on its ability to predict the MJO. However, the MJO has not been found to be simulated realistically in most climate models (Lin et al. 2006), suggesting a need for improved parameterizations of deep convection and perhaps increased resolution in such models.
A comprehensive set of diagnostics for describing MJO variability in climate simulations has been developed by the Climate Variability and Predictability (CLIVAR) MJO Working Group (Waliser et al. 2009) and applied to a variety of climate models (Kim et al. 2009). These diagnostics have been applied to 50-yr time series from freely running CanCM3 and CanCM4 simulations, enabling aspects of MJO variability to be quantified relative to the observed MJO and other models. A small subset of these diagnostics that provides a general view of the modeled MJO variability is described here. Figure 21 shows the 20–100-day filtered variance of wintertime (November–April) tropical outgoing longwave radiation (OLR), which provides a measure of the intraseasonal variability of deep convection. The top panel shows the variability of OLR measurements from the Advanced Very High Resolution Radiometer (AVHRR). By comparison, intraseasonal OLR variability in CanCM3 (middle panel, note difference in scale) is much too weak, and there are significant errors in its distribution with too much variability in the western Indian Ocean and western Pacific relative to the central Indian Ocean and Maritime Continent (pattern correlation 0.71). These biases may be related to a tendency for large-scale stratiform precipitation to supplant deep convective tropical precipitation in CanCM3 (Scinocca and McFarlane 2004). The magnitude of intraseasonal OLR variability is more realistic in CanCM4 although still somewhat too weak (bottom panel), whereas its spatial distribution, with a pattern correlation of 0.86, is considerably more realistic than in CanCM3.
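The two quantities used in this comparison, band-limited variance and pattern correlation, can be sketched as follows. This is an illustrative approximation only: it applies a simple spectral mask to a daily time series, whereas the CLIVAR diagnostics define their own filtering procedure, and it computes an unweighted, centered pattern correlation, whereas the weighting used for the values quoted above is not stated here. The function names and default periods are assumptions.

```python
import numpy as np

def bandpass_variance(series, low_period=20.0, high_period=100.0):
    """Variance of daily data retained within a band of periods (in days).

    series: array with time (days) along axis 0.
    """
    series = np.asarray(series, dtype=float)
    anom = series - series.mean(axis=0)
    n = anom.shape[0]
    freqs = np.fft.rfftfreq(n, d=1.0)  # cycles per day
    spec = np.fft.rfft(anom, axis=0)
    # Keep only frequencies with periods between low_period and high_period.
    keep = (freqs >= 1.0 / high_period) & (freqs <= 1.0 / low_period)
    spec[~keep] = 0.0
    filtered = np.fft.irfft(spec, n=n, axis=0)
    return filtered.var(axis=0)

def pattern_correlation(a, b):
    """Centered, unweighted correlation between two maps of equal shape."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))
```

In practice the variance maps would be computed from wintertime data only and the correlation would typically be weighted by grid-cell area; both refinements are omitted here for brevity.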