1. Introduction
The growth of systematic errors in general circulation models (GCMs) remains one of the central problems in producing accurate predictions of climate change for the next 50 to 100 years (Randall et al. 2007). Although great advances in global climate modeling have been made in recent decades (Solomon et al. 2007), there are still large uncertainties in many processes such as clouds, convection, and coupling to the oceans and the land surface (e.g., Cubasch et al. 2001; Koster et al. 2004). There is obviously great interest in predictions of extreme events such as tropical cyclones and how they will change because of anthropogenic forcing. However, tropical prediction remains a particular challenge. Major modes of tropical variability, such as the Madden–Julian oscillation (MJO; Zhang 2005) and El Niño–Southern Oscillation (ENSO; Allan et al. 1996), are still relatively poorly understood and often poorly modeled (e.g., Slingo et al. 1996; Sperber and Palmer 1996; Van Oldenborgh et al. 2005; Guilyardi et al. 2009). On shorter time scales, the diurnal cycle of convection over tropical land is another key weakness in models (Betts and Jakob 2002; Yang and Slingo 2001).
One issue impeding progress is that attributing the growth of systematic errors to the modeling of particular physical processes is notoriously difficult in climate GCMs. This is due both to a lack of observational data and to the nonlinear interactions among various physical processes and the errors in modeling them. A possible way forward is to use short-range (1–5 day) forecasts from a numerical weather prediction (NWP) framework as a means of evaluating parameterizations in climate models (Phillips et al. 2004). This has a number of advantages. First, the NWP forecasts are run from initial states generated with state-of-the-art variational data assimilation (e.g., Lorenc et al. 2000; Rawlins et al. 2007). This means that the errors in the large-scale synoptic flow are minimized and there are no large biases in the circulation due to remote forcing effects (e.g., tropical–extratropical interactions). Such biases in circulation in a climate model make it difficult to determine if the parameterized physical processes are performing poorly because of errors in their formulation or errors in the inputs to the parameterizations themselves. This approach was taken to the extreme limit of examining one-time-step forecasts by Klinker and Sardeshmukh (1992) to identify errors in the momentum balance of the European Centre for Medium-Range Weather Forecasts (ECMWF) forecasts and by Rodwell and Palmer (2007) to quantify the uncertainty in climate change forecasts due to model error. Second, detailed up-to-date observational datasets [e.g., Atmospheric Radiation Measurement Program (ARM; Ackerman and Stokes 2003) and Tropical Ocean and Global Atmosphere Coupled Ocean–Atmosphere Response Experiment (TOGA COARE; Webster and Lucas 1992)] can be used to evaluate the individual physical processes at the temporal and spatial scales of individual weather systems.
Conversely, seasonal and climate coupled modeling provide a very strong constraint on the veracity of the global model formulation. The physics–dynamics developed must perform in the coupled ocean–atmosphere–cryosphere environment without systematic drifts [e.g., in sea surface temperatures (SSTs)]. The longer-time-scale integrations also allow an evaluation of the model performance in modes of low-frequency variability such as ENSO, the quasi-biennial oscillation (QBO; Scaife et al. 2000), and the North Atlantic Oscillation (NAO; Hurrell and van Loon 1997). As Hurrell et al. (2009) point out, using similar models for predictions on different time scales can result in improved skill in both weather and climate forecasts through stronger collaboration and shared knowledge among those in the NWP and climate communities working on parameterization schemes. The Met Office Unified Modeling system (Cullen 1993) has benefitted from the synergy between NWP and climate modeling in many past model developments, including a new boundary layer turbulent mixing scheme (Lock et al. 2000; Martin et al. 2000), orographic parameterizations (Milton and Wilson 1996; Gregory et al. 1998), and most recently in development of the new semi-implicit, semi-Lagrangian dynamical core (Davies et al. 2005; Martin et al. 2006; Ringer et al. 2006).
The current operational global NWP and climate configurations of the Met Office Unified Model (MetUM) have very similar dynamical and physical formulations (see section 2). Comparison of zonal mean temperature and zonal wind cross sections (Fig. 1) shows that the models have similar systematic errors on the largest scales, despite differences in both horizontal resolution and prediction time scale. This similarity provides the opportunity to tackle such errors through a joint model development approach. The seasonal and decadal modeling frameworks at the Met Office have, until recently, differed from the configuration used for weather and climate time scales. However, new frameworks for seasonal and decadal modeling are currently under development and will enable the development of systematic errors on the monthly-to-decadal time scale to be studied. Ultimately it is hoped that the whole suite of MetUM models will benefit from improvements achieved in this way.
In this paper we focus on two particular areas of concern in the MetUM on climate and NWP time scales: (i) tropical performance, in particular excessive precipitation and evaporation over tropical oceans and related circulation errors, and (ii) summer land surface temperature and moisture biases over northern continents. The paper is arranged as follows: section 2 outlines the model formulations used in this study, section 3 discusses the analysis and reduction of the tropical biases, and section 4 concentrates on the Northern Hemisphere summer land surface temperature and moisture biases. The impact of this work on the overall performance of the models is described in section 5, and conclusions are drawn in section 6.
2. Model configuration and experimental details
The MetUM forms the basis for weather and climate prediction across a wide range of spatial and temporal scales. In this study we focus attention on three prediction time scales. The first time scale is the short range (up to 6 days). The operational deterministic global NWP forecast is run twice a day, from 0000 and 1200 UTC analyses. The analyses are produced using four-dimensional variational data assimilation (Rawlins et al. 2007), with SSTs and sea ice initialized once a day during the assimilation cycle then held fixed through the course of the 6-day forecasts. In this study a number of recent and past global NWP model cycles (Table 1) are discussed in the context of improvements to tropical performance and Northern Hemisphere land surface temperature and moisture biases. Further details of the operational global NWP model cycles are given in Allan et al. (2005, 2007). In addition, 5-day predictions were also run with the climate model configuration initialized from weekly analyses through four recent boreal winter and summer seasons (2001 to 2004). This uses the climate model within the NWP framework following a similar strategy to Phillips et al. (2004).
The second time scale is the medium range (up to 15 days). Since 2006, the Met Office Global and Regional Ensemble Prediction System (MOGREPS; Bowler et al. 2007a,b) has been running with a 24-member global ensemble to 6 days ahead on a daily basis. In addition, ensembles of 15-day forecast runs (MOGREPS-15) are run daily at ECMWF as part of The Observing System Research and Predictability Experiment (THORPEX) Interactive Grand Global Ensemble (TIGGE) program (Titley et al. 2008). Here we utilize results from a series of 15-day deterministic forecasts run as a test-bed suite for evaluating new physics parameterization developments prior to implementation in the MOGREPS-15 production suite (Savage et al. 2006). The 15-day deterministic forecasts are initialized from Met Office 1200 UTC global operational analyses and SSTs are held fixed throughout the 15 days of the forecasts.
The final time scale is that of climate prediction. The climate configuration of the MetUM, Hadley Centre Global Environmental Model version 1 (HadGEM1), is described by Martin et al. (2006) and Johns et al. (2006). This version forms the control over which we measure the improvement in systematic errors of the tropical circulation and summer land surface temperatures and moisture as seen in the final configuration of a new climate version, HadGEM2-AO (Collins et al. 2008). Analysis of the systematic errors in the climate model is carried out using both atmosphere-only runs (HadGAM1; Table 1) and coupled runs (HadGEM1 and HadGEM2-AO; Table 1). The atmosphere-only runs were forced by prescribed SSTs and sea ice from the second Atmospheric Model Intercomparison Project (AMIP-II; Gates et al. 1999) from 1979 to 1999 and using other boundary forcing as described in Martin et al. (2006). The coupled runs were run in persistent preindustrial mode using fixed 1860 forcing levels for greenhouse gases, ozone, sulfur, and other aerosol precursor emissions and land surface boundary conditions [see Johns et al. (2006) for further details].
The model results are compared against a range of observations and (re)analyses, depending on the relevant time scale. In general, the short-range and medium-range predictions are compared against either Met Office analyses or radiosonde ascents, while the climate predictions are compared against ECMWF reanalyses (Uppala et al. 2005). The similarity between the biases in the different configurations compared against the different verification datasets provides confidence that the results are robust.
3. Tropical performance
a. Analysis of systematic errors on NWP and climate time scales
At NWP time scales there is excessive precipitation and evaporation over tropical oceans. These errors have implications for the water and energy cycles, tropical cyclones, and aviation products in the short range, while on medium ranges, tropical errors can impact extratropical forecasts through teleconnections. As discussed by Johns et al. (2006), the simulation of ENSO is a major weakness of HadGEM1. The climatological trade winds are too strong in the east Pacific, and the associated excessive zonal wind stress in the equatorial region drives excessive upwelling across much of the tropical Pacific. In consequence, HadGEM1 exhibits a marked cold bias in the equatorial Pacific (Johns et al. 2006, their Fig. 3). Johns et al. (2006) showed that the observed eastward shift of the tropical convection during El Niño events, associated with a collapse of the Walker circulation, is not captured in HadGEM1, but that HadGAM1 (with prescribed SSTs) does reproduce the observed features in a more satisfactory manner, suggesting that the problems in HadGEM1 are probably due to the combined interactions of atmosphere and ocean.
The excessive low-level zonal wind is clearly seen in atmosphere-only runs on both climate and 1–5-day forecast time scales (Figs. 2b,a respectively). Thus, the source of the wind stress bias in coupled atmosphere–ocean runs of HadGEM1 appears to be the atmosphere model. The zonal wind error is seen to grow steadily in short-range forecasts from analysis time (Fig. 2a) and appears to have saturated at day 5 to similar levels (1.5 m s−1) as seen on climate time scales. This suggests that the error occurs through an immediate response of the model’s physical parameterizations or dynamics and subsequently persists into equilibrium.
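The diagnosis of error growth with forecast range described above amounts to averaging forecast-minus-analysis differences over many forecasts at each lead time. The following is a minimal sketch of that calculation, not the verification code used in this study; the function name, array layout, and all numbers are assumptions made purely for illustration.

```python
import numpy as np


def mean_error_by_lead(forecast, verification):
    """Return the mean forecast-minus-verification error at each lead time.

    Both inputs have shape (n_forecasts, n_leads), with verification[i, k]
    being the analysis (or observation) valid at lead k of forecast i.
    """
    forecast = np.asarray(forecast, dtype=float)
    verification = np.asarray(verification, dtype=float)
    if forecast.shape != verification.shape:
        raise ValueError("forecast and verification must have the same shape")
    # Averaging over forecasts (axis 0) isolates the systematic part of the error.
    return (forecast - verification).mean(axis=0)


# Synthetic illustration only: none of these numbers come from the MetUM.
lead_days = np.arange(6)                                  # leads of 0 to 5 days
verification = np.full((4, lead_days.size), -5.0)         # made-up easterly wind (m/s)
imposed_drift = -1.5 * (1.0 - np.exp(-lead_days / 1.5))   # made-up saturating drift
forecast = verification + imposed_drift                   # same drift in every forecast
bias = mean_error_by_lead(forecast, verification)
print(np.round(bias, 2))
```

In this synthetic case the mean error is zero at analysis time and grows toward a plateau, which is the qualitative behavior that would indicate a fast-acting error saturating within the short range.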
As discussed in the introduction, both climate and NWP configurations exhibit similar tropical systematic biases in the thermodynamic fields. We analyze this error growth in further detail by comparing temperature and moisture profiles in MetUM analyses and 36-h forecasts against radiosonde observations from the ARM Manus Island region (Ackerman and Stokes 2003). The cold bias at upper tropospheric levels (14–26 km) is already present even in the analysis, while the midtropospheric warm bias steadily evolves over the first 36 h of the forecast (Fig. 3). The relative humidity (RH) profiles show a number of interesting discrepancies between analyses, forecasts, and radiosondes. Above the freezing level (about 5 km) the analysis is moister than the radiosondes by around 10%, which perhaps could be argued to be within observational uncertainty of measuring RH in moist tropical atmospheres (Ciesielski et al. 2003). However, the overall structure of decreasing RH with height is consistent with the radiosondes. More interesting is the evolution of the RH in the 30–36-h forecasts, which moisten by 5%–10% relative to the analysis between 15 and 20 km and dry between 10 and 15 km. Although some of this RH drift is correlated with the warm–cold bias dipole, similar drying–moistening is also seen in the specific humidity fields themselves (not shown).
One hypothesis for this structural change in the humidity profile is that the model’s convective parameterization is detraining too little moisture in the mid to upper troposphere and too much once the convective parcel finally terminates and detrains near the tropopause. The model/analysis hydropause is also too low, which suggests that parameterized convection does not penetrate high enough compared with the radiosondes. This particular error was also noted in the studies of the MetUM convective parameterization against TOGA COARE data (Willett et al. 2008) and in comparisons of modeled tropical brightness temperatures with satellite data (Milton et al. 2001).
The modeled tropical precipitation on both medium-range (Figs. 4e,f) and climate (Figs. 4c,d) time scales shows positive biases over the tropical oceans compared with Global Precipitation Climatology Project (GPCP) (Huffman et al. 2001) and Climate Prediction Center (CPC) Merged Analysis of Precipitation (CMAP) (Xie and Arkin 1997) monthly precipitation datasets, with the model biases generally larger than the uncertainty between the observational estimates (Fig. 4b). These errors are consistent with excessive diabatic heating of the tropical atmosphere and the growth of the midtroposphere tropical warm bias discussed above. Although similar on the largest scales, the detailed patterns of error on medium-range and climate time scales are subtly different. This may be related to differences in the initialization as well as horizontal resolution.
The incorrect distribution of tropical diabatic heating will clearly feed back onto the tropical circulation. For example, idealized modeling experiments by Hartmann et al. (1984) showed strong sensitivity of the Walker circulation and low-level winds to the vertical distribution of diabatic heating associated with mature cloud clusters (maximum in upper troposphere) compared with more conventional heating profiles peaking in the midtroposphere. The 200-hPa velocity potential (Fig. 5) shows excessive divergent outflow over the Indian Ocean at both climate and medium-range time scales, consistent with the excessive precipitation in these regions (Fig. 4). At climate time scales, there are also two regions of excess subsidence linked to the divergence errors via a north–south Hadley circulation: one over Australia and a second over the east Pacific, while the medium-range forecast shows excess convergence right across the Pacific and into the tropical Atlantic. It appears that, on the medium-range time scale, the errors project more onto the east–west Walker circulation than the north–south Hadley circulation. The 200-hPa streamfunction analysis fields (Figs. 6a,d) are dominated by the twin anticyclones of the mean Asian monsoon circulation, straddling the equator with strong easterly flow of the tropical easterly jet (TEJ) between them. The broad-scale hemispheric biases at both medium-range and climate time scales are similar, with the MetUM showing a tendency to weaken these anticyclones and produce erroneous westerly flow in the Indian Ocean, weakening the TEJ. This excessive upper-level westerly flow continues into the Pacific and is consistent with too strong a Walker circulation and too strong easterly flow in the near surface equatorial winds (Fig. 2).
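For reference, the velocity potential and streamfunction used above arise from the standard Helmholtz decomposition of the horizontal wind into divergent and rotational parts. The relations below assume one common sign convention (the opposite sign for the velocity potential is also in use), and the symbols are introduced here for illustration rather than taken from the model documentation.

```latex
\begin{align}
  \mathbf{V} &= \mathbf{V}_{\psi} + \mathbf{V}_{\chi}
              = \mathbf{k} \times \nabla\psi + \nabla\chi, \\
  \zeta &= \mathbf{k} \cdot \nabla \times \mathbf{V} = \nabla^{2}\psi, \\
  D &= \nabla \cdot \mathbf{V} = \nabla^{2}\chi .
\end{align}
```

Here $\psi$ is the streamfunction, $\chi$ the velocity potential, $\zeta$ the relative vorticity, and $D$ the horizontal divergence. Under this convention, a region of upper-level divergent outflow ($D > 0$) appears as a local minimum in $\chi$.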
In the following section we discuss some physical improvements to the MetUM convective and boundary layer parameterizations that have an impact on the equatorial wind stress, precipitation, diabatic heating, and overall biases in tropical circulation.
b. Testing proposed solutions—Improvements to physical parameterizations
There has been growing evidence over a number of years that the MetUM convective parameterization (described in Gregory and Rowntree 1990; Gregory and Allen 1991; Gregory et al. 1997) suffered from a number of structural deficiencies highlighted by the tropical biases in the previous section and discussed in more detail in recent Global Energy and Water Cycle Experiment (GEWEX) Cloud System Study (GCSS) TOGA COARE intercomparisons (Petch et al. 2007; Willett et al. 2008). Further evidence was presented as part of the European Cloud Systems study (EUROCS) where Derbyshire et al. (2004) showed that mass flux profiles in the MetUM convective parameterization were unrealistic compared with Cloud Resolving Models (CRMs) and that the convective parameterization failed to respond correctly to variations in environmental humidity. In response to these deficiencies, a number of changes to the parameterization have been made. An adaptive detrainment parameterization for convection has been developed (Derbyshire et al. 2010, manuscript submitted to Quart. J. Roy. Meteor. Soc.), which relates detrainment to the buoyancy excess of the parcel. This replaces the “forced detrainment,” which only detrains when the buoyancy goes below a certain threshold (thus leading to step changes in the convective updraft mass–flux profile). This results in improved mass–flux profiles in which detrainment occurs more gradually over a greater number of model levels, leading to increased warming in the tropical upper troposphere, which reduces the cold bias in this region in the MetUM (Derbyshire et al. 2010, manuscript submitted to Quart. J. Roy. Meteor. Soc.).
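The qualitative difference between threshold-triggered and buoyancy-dependent detrainment can be illustrated with a deliberately simple toy calculation. This is not the MetUM scheme or the formulation of Derbyshire et al.; the functions, the rule that scales mass flux by the ratio of successive buoyancy values, the threshold, and the buoyancy profile are all hypothetical and chosen only to show why one approach gives a step change and the other a gradual decrease.

```python
def threshold_detrainment(buoyancy, threshold=0.2):
    """Toy rule: full mass flux until buoyancy falls below a fixed threshold."""
    mass_flux, current = [], 1.0
    for b in buoyancy:
        if b < threshold:
            current = 0.0  # the whole updraft detrains in one level
        mass_flux.append(current)
    return mass_flux


def buoyancy_scaled_detrainment(buoyancy):
    """Toy rule: mass flux falls in proportion to the decrease in buoyancy."""
    mass_flux, current = [], 1.0
    previous = buoyancy[0]
    for b in buoyancy:
        if b <= 0.0:
            current = 0.0              # parcel is no longer buoyant
        elif b < previous:
            current *= b / previous    # detrain a fraction at this level
        mass_flux.append(current)
        previous = b
    return mass_flux


def largest_drop(profile):
    """Largest decrease in mass flux between adjacent levels."""
    return max(lower - upper for lower, upper in zip(profile, profile[1:]))


# Hypothetical normalized buoyancy excess, listed from cloud base upward.
buoyancy = [1.0, 1.0, 0.8, 0.6, 0.4, 0.2, 0.0]
step_profile = threshold_detrainment(buoyancy)
smooth_profile = buoyancy_scaled_detrainment(buoyancy)
print(largest_drop(step_profile), largest_drop(smooth_profile))
```

In the toy, the threshold rule removes all of the mass flux in a single level, whereas the buoyancy-scaled rule spreads the same loss over several levels, which is the kind of smoother profile the text attributes to adaptive detrainment.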
A number of other improvements have also been made to the surface and boundary layer parameterizations. These address known deficiencies in momentum and scalar transports and include (i) modifying stable boundary layer turbulent mixing over the ocean through the use of short-tailed stability functions that better match large eddy simulations (LES), (ii) changes to the surface scalar transfer over the ocean bringing the dependence on wind speed more in line with observations (Edwards 2007)—the main impacts being a reduction in oceanic latent heat fluxes and reduced precipitation over the oceans, which is beneficial in terms of model systematic error (see Fig. 4), and (iii) a nonlocal scheme for momentum mixing in convective conditions. Without these nonlocal stresses, Brown and Grant (1997) showed that wind profiles in the boundary layer were less well mixed than suggested by LES studies. Changes (i) and (ii) are reported in more detail in Brown et al. (2008). The impacts of changes (i)–(iii) were generally smaller at climate time scales than those seen for adaptive detrainment. For the remainder of this section we will largely focus on the impacts of the adaptive detrainment on tropical performance.
1) Impacts on thermodynamic fields
Including adaptive detrainment results in smoother mass–flux profiles, which are in better agreement with CRMs (Derbyshire et al. 2010, manuscript submitted to Quart. J. Roy. Meteor. Soc.). Sensitivity tests carried out in short-range forecasts show the impact on the parameterized convective heating increments (K day−1) is twofold (Fig. 7). The shape of the convective heating is changed, with less heating in the tropical midtroposphere and more heating aloft. The convective heating also penetrates higher in the troposphere, which again reduces a known systematic error in the model (Fig. 3), and is more responsive around the freezing level (about 5 km) where we see a minimum in convective heating compared with the old scheme with fixed threshold detrainment. The total (parameterized) diabatic heating also shows similar structural changes arising from convection detraining more gradually and extending higher in the atmosphere.
The impact of these changes in diabatic heating is manifest as reduced model biases in temperature and winds at medium-range and climate time scales (Fig. 8). The day 11–15 cold bias and the equatorial westerly wind bias in the upper tropical troposphere (Figs. 8c,d) are both reduced compared with the original biases (Figs. 1c,d). Similar reductions are also seen in forecast verification against radiosondes (Derbyshire et al. 2010, manuscript submitted to Quart. J. Roy. Meteor. Soc.). At climate time scales, the westerly wind biases are also significantly reduced (cf. Fig. 8b and Fig. 1b). The reduction in the temperature biases is discernible (Fig. 8a and Fig. 1a) but is smaller than that seen in the medium-range forecasts.
2) Impacts on precipitation and tropical circulation
Precipitation is reduced over the tropical oceans on both medium-range and climate time scales when adaptive detrainment is included (Figs. 4g,h). This is consistent with a reduction in the column-integrated diabatic heating (see Fig. 7). In particular, the excessive oceanic precipitation in the central Indian Ocean is reduced during June–August. This has been a persistent bias in MetUM Asian monsoon predictions at all time scales. Again the impacts are larger at the shorter time scale, where they lead to a significant reduction in the precipitation bias (cf. Figs. 4f,j). The smaller impact at climate time scales suggests that other longer-time-scale errors may modulate the adaptive detrainment benefits. This requires further investigation within the seasonal modeling framework currently under development for the next-generation HadGEM family of climate models. One area where precipitation biases are worse is in the equatorial east Pacific, particularly at medium range.
With adaptive detrainment, the divergent errors over the Indian Ocean region are reduced (Figs. 5c,f), consistent with the changes in precipitation discussed above. However, on both climate and medium-range time scales, a large divergent error appears in the east Pacific. One possible hypothesis is that, prior to the introduction of adaptive detrainment, the forcing of too strong a Walker circulation and associated strong descent over the east Pacific tended to suppress convection in that region. The reduced diabatic heating/precipitation in the Indian Ocean and west tropical Pacific with adaptive detrainment improves the Walker circulation [as shown by improved equatorial near surface winds (Figs. 2b,c)], which allows convection to develop more readily in this region. This is perhaps a case where improving the error in one region has removed a compensating error in a remote region.
The introduction of adaptive detrainment and the associated reduction in divergent errors also gives large improvements of around 30% in the rotational flow (Figs. 6c,f) on a hemispheric scale. In the tropics the excessive westerly flow along the equator at 200 hPa is reduced, particularly on climate time scales. In short-range forecast tests, the introduction of the nonlocal momentum mixing (Brown et al. 2008) also further reduced the excessive wind speeds in the tropical boundary layer, but the remaining boundary layer revisions had little impact on the winds (not shown). The improvements in low-level equatorial winds and wind stress have implications for the ENSO response in the coupled model simulations (see section 5b).
4. Summer land surface error in Northern Hemisphere continental regions
a. Analysis of the error in NWP and climate simulations
On climate time scales, there are extensive warm biases in daily mean near-surface temperature over northern continents in summer [June to August (JJA); Fig. 9a]. Analysis of daily maximum and minimum temperatures shows that both daytime and nighttime warm biases contribute to this overall error (Fig. 9b). These summer warm biases lead to a poor simulation of the boreal forests when this model is coupled to interactive vegetation (Collins et al. 2008). Major changes in vegetation cover, such as Amazon dieback (Cox et al. 2004), could have major biogeophysical feedbacks on climate (Betts et al. 2004). However, reliable predictions of vegetation cover cannot be made if the initial vegetation state in the model contains large systematic errors. On NWP time scales, daytime temperatures over land in boreal summer, which are a key forecast product for customers, are also overestimated (not shown).
Previous studies have highlighted deficiencies in the simulation of clouds and aerosols in the MetUM on climate time scales. Compared with observations, HadGAM1 has too little cloud cover over both Central Asia and North America (Martin et al. 2006) and similar errors are seen in short-range forecasts (Milton and Earnshaw 2007; Williams and Brooks 2008). The deficits are mainly in low- and midlevel clouds of thick and intermediate optical depth. These errors in the cloud distribution result in an underestimation of the shortwave cloud radiative forcing [the difference between the total (or all-sky) radiation budget fields and those for cloud-free conditions] over both Eurasia and North America (Martin et al. 2006). In addition, both shortwave and longwave downward clear-sky fluxes are overestimated in HadGAM1. Aerosol optical depths are underestimated globally in HadGAM1 compared with satellite observations (N. Bellouin, personal communication) and surface measurements (Collins et al. 2008), and the error in clear-sky radiative fluxes is largely due to the lack of representation of natural (biogenic) continental aerosols and mineral dust aerosols in HadGEM1 (Bodas-Salcedo et al. 2008).
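The cloud radiative forcing defined above is a simple difference between all-sky and clear-sky fluxes. A minimal sketch follows; the function name, the choice of net downward flux as the sign convention, and the example values are assumptions for illustration and are not taken from the model or the observations.

```python
def cloud_radiative_forcing(all_sky_flux, clear_sky_flux):
    """All-sky minus clear-sky flux, in the same units as the inputs (W m-2)."""
    return all_sky_flux - clear_sky_flux


# Hypothetical net downward shortwave fluxes at the top of the atmosphere.
observed_crf = cloud_radiative_forcing(all_sky_flux=240.0, clear_sky_flux=290.0)
modeled_crf = cloud_radiative_forcing(all_sky_flux=265.0, clear_sky_flux=290.0)
print(observed_crf, modeled_crf)
```

With net downward shortwave flux, the forcing is negative because clouds reflect sunlight. A model with too little cloud then shows a forcing of smaller magnitude, as in the made-up values here, and correspondingly more shortwave radiation reaches the surface.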
Examination of the spinup of the temperature biases over the first few days of an ensemble of HadGAM1 runs reveals that a positive surface-temperature bias develops in about 3 days over Central Asia, reaching an amplitude of about 2 K (Fig. 10). This warming extends throughout the lower troposphere and is accompanied by a decrease in midlevel cloud amounts and in surface latent heat flux (not shown). This rapid error growth suggests that at least some of the warm bias is related to the model physics rather than changes in atmospheric circulation.
One possible cause of the continental warm bias on climate time scales is that erroneously low summer soil moisture may reduce evaporative cooling of the surface, and hence result in erroneous surface warming. Soil moisture is notoriously difficult to validate as few reliable datasets exist. However, it is possible to compare the soil moisture in HadGAM1 with a previous model version, HadAM3P (Jones et al. 2006), in which the continental land surface temperatures are more realistic. Over Central Asia, the soil moisture is indeed noticeably lower in HadGAM1 than HadAM3P (Fig. 11). The peak in soil moisture in HadAM3P occurs in April when snowmelt is at its maximum in this region. In contrast, HadGAM1 shows little variation in soil moisture between January and April prior to the summer “dry down.” Further examination reveals that, while the majority of runoff in HadAM3P is subsurface, the surface component is dominant in HadGAM1. This is related to the change in runoff parameterization from Met Office Surface Exchange Scheme (MOSES)-I (Cox et al. 1999) to MOSES-II (Essery et al. 2001; see section 4b).
To isolate other possible sources of the warm bias we have evaluated boreal summer near surface temperatures, precipitation, and fluxes against observations from a site in northwest China (Tongyu) for an 18-day period during July 2003. This site was also studied in Milton and Earnshaw (2007) but here we extend the comparison to include satellite estimates of cloud and aerosol loading. The site is open grassland with a grass canopy of less than 10 cm year round, and although care must be taken in comparing a single site with a model NWP gridbox of 60 km, this location represents reasonably homogeneous terrain and vegetation. The observations and their method of measurement are outlined in Table 2.
The air temperature (Fig. 12a) shows MetUM has a daytime warm bias of 2°–4°C for 10 out of the 18 days, with a mean daily air temperature of 24.1°C compared with 23.5°C for the observations (Table 2). The remaining days have much smaller temperature biases and are characterized either by significant precipitation events, which the model captures reasonably well (Fig. 12i), or pristine clear skies such as 28 and 29 July [low cloud cover, low aerosol loadings, and maximum surface solar insolation (Figs. 12b,c,d)].
On the nonprecipitating cloudy days the downward shortwave (SW) surface radiative fluxes are overestimated by 100–200 W m−2, contributing to the surface warm bias. Possible reasons for this overestimate are (i) lack of cloud cover and/or too small cloud liquid/ice water contents, (ii) underestimates of the column water vapor, or (iii) a lack of aerosol radiative forcing. While it is beyond the scope of this paper to explore each of these hypotheses in depth, we have tried to evaluate model cloud and aerosol against available satellite estimates. Comparison with daily mean cloud fraction estimated from the Moderate Resolution Imaging Spectroradiometer (MODIS) suggests the model underestimates cloud cover (Fig. 12b). Comparison of the National Oceanic and Atmospheric Administration (NOAA) daily mean outgoing longwave radiation (OLR) of Liebmann and Smith (1996) with MetUM (Fig. 12f; Table 2) shows the day-to-day variability is actually well captured by the 12–24-h forecasts, but the model’s tendency is to slightly overestimate OLR on the nonprecipitating days, again consistent with too little cloud cover or too thin cloud. Comparison of the upward SW radiative fluxes at this site also shows an underestimation of the surface albedo that contributes to the warm bias [see Milton and Earnshaw (2007) for discussion].
For the global NWP configurations the aerosol radiative forcing is parameterized by a simple climatology characterizing land–sea contrasts in aerosol loading (Cusack et al. 1998). For the Tongyu site the MetUM aerosol optical depth (AOD) is estimated at a constant 0.25, whereas the MODIS AOD is between 0.5 and 0.8 for most of the period, falling to lower values toward the end of July. The aerosol forcing in this region includes significant quantities of mineral dust blown from the Tibet/Mongolia region (Uno et al. 2006). Lack of aerosol radiative forcing in the NWP forecasts will clearly contribute to the excessive downward SW flux and warm bias at the surface. The climate configuration already contains a parameterization of major aerosol species (although there are deficiencies in these schemes as discussed above). There are plans for the NWP configurations to follow this lead.
The surface latent and sensible heat fluxes are both overestimated, with largest errors on the nonprecipitating cloudy days, consistent with the downward SW errors (Table 2; Figs. 12c,g,h). The exception is 14–16 and 31 July where latent heat flux is underestimated and sensible heat flux overestimated. This error in Bowen ratio arises from excessively low soil moisture at this time as shown by comparisons with the soil moisture products from the Noah and Variable Infiltration Capacity (VIC) land surface models forced with observed fluxes and precipitation as part of the Global Land Data Assimilation System (GLDAS; Rodell et al. 2004). This dry bias may be linked both to errors in the treatment of runoff and to the use of a poor soil moisture climatology to constrain the initial soil moisture fields. It is possible that, if the soil moisture could be initialized accurately from observations, the near-surface warm bias may not develop in short-range forecasts. In fact, we may even see a cold bias if the cloud and SW errors were still present and led to excess evaporation. However, the drift in surface temperature would still suggest a physical link between short-range and climate errors. Since August 2005, the NWP configurations have improved initialization of soil moisture with a nudging scheme using near surface temperature and humidity (Best and Maisey 2002) and based on a similar approach to Mahfouf (1991).
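The Bowen ratio referred to above is the ratio of the surface sensible heat flux to the surface latent heat flux. The short sketch below shows how an underestimated latent heat flux combined with an overestimated sensible heat flux raises the ratio; the function name and the flux values are hypothetical and are not the Tongyu observations or model values.

```python
def bowen_ratio(sensible_heat_flux, latent_heat_flux):
    """Ratio of sensible to latent surface heat flux (dimensionless)."""
    if latent_heat_flux == 0:
        raise ValueError("latent heat flux must be nonzero")
    return sensible_heat_flux / latent_heat_flux


# Made-up daytime fluxes in W m-2 for a moist and an overly dry soil state.
moist_soil = bowen_ratio(sensible_heat_flux=100.0, latent_heat_flux=200.0)
dry_soil = bowen_ratio(sensible_heat_flux=180.0, latent_heat_flux=120.0)
print(moist_soil, dry_soil)
```

With the same available energy, the drier soil state puts more of it into sensible heating of the near-surface air and less into evaporation, which is the pathway by which a soil moisture deficit can contribute to a warm bias.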
In summary, the 12–24-h NWP forecasts show a similar daytime warm bias to the climate model. Comparison with in situ and satellite observations suggests the largest errors occur on nonprecipitating cloudy days due to lack of cloud or small cloud liquid water contents, resulting in too large downward SW radiation warming of the surface. Clearly the air temperature errors are also affected by (i) the accuracy of the soil moisture contents and partitioning of the SW flux sensible and latent heat fluxes, and (ii) deficiencies in the MetUM surface albedo. The errors in all variables are much smaller on days with either significant precipitation or pristine clear skies (low cloud fractions and AODs).
b. Testing proposed solutions
1) Clouds and aerosols
Following the above analysis, the treatment of clouds and aerosols in the MetUM has been investigated. Examination of the time-step behavior of convection in the MetUM shows that, although instantaneous convective cloud properties are reasonable, the combination of intermittent triggering of the convection scheme and the fact that the radiation scheme is only called every 3 h (for cost reasons) can result in underestimation of the radiative effects of convectively generated cloud, including anvil clouds (A. Lock 2007, personal communication). The convection scheme has been modified so that convective cloud properties are allowed to decay exponentially with a 2-h half-life. This results in more continuous convective cloud and increased average convective cloud amount. The result is some small, but significant, improvements in the Northern Hemisphere continental temperatures, of around 1°C on climate time scales (Fig. 13a) and 0.4°C in short-range forecasts (Fig. 13d).
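The effect of such a decay on intermittently triggered convective cloud can be illustrated with a minimal sketch. It assumes a simple update rule in which the cloud amount at each time step is the larger of the freshly diagnosed value and the previous value decayed with a 2-h half-life; the function name, the 30-min time step, and the max() rule are illustrative assumptions and are not taken from the MetUM code.

```python
HALF_LIFE_H = 2.0  # half-life quoted in the text (hours)


def decayed_cloud(diagnosed, dt_h=0.5, half_life_h=HALF_LIFE_H):
    """Apply an exponential-decay memory to a series of cloud amounts.

    diagnosed : cloud amounts diagnosed by the convection scheme per step
    dt_h      : model time step in hours (assumed value, for illustration)
    """
    factor = 0.5 ** (dt_h / half_life_h)  # decay applied over one time step
    result, previous = [], 0.0
    for amount in diagnosed:
        # Hypothetical rule: keep whichever is larger, new or decayed cloud.
        previous = max(amount, previous * factor)
        result.append(previous)
    return result


# Convection triggers only on steps 0 and 4; cloud is zero in between.
diagnosed = [0.4, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0]
smoothed = decayed_cloud(diagnosed)
mean_before = sum(diagnosed) / len(diagnosed)
mean_after = sum(smoothed) / len(smoothed)
```

In this toy series the decayed cloud never drops to zero between triggers, so a radiation call falling on any step sees some convective cloud, and the time-mean cloud amount is larger than in the undecayed series.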
Several changes and additions to the representation of aerosol have been made during the development of HadGEM2-AO (Bellouin et al. 2007), including a climatology of secondary organic aerosol from biogenic terpene emissions [created using results from the Stochastic Chemistry Model (STOCHEM; Derwent et al. 2003)] and a mineral dust scheme (Woodward 2001). These changes improve the agreement in aerosol optical depth between model and observations, allow the seasonal variations in aerosols over the Northern Hemisphere continental regions to be captured (Bellouin et al. 2007), and lead to a reduction in the Northern Hemisphere continental temperatures of between 1° and 2°C (Fig. 13b) due to the direct radiative forcing from aerosols. The addition of biogenic aerosol in the NWP configurations at model cycle G44 (Table 1) also contributes to the overall reduction in 1.5-m temperature warm bias in forecasts over land during boreal summer (Fig. 13d).
2) Land surface characteristics
Typically, in springtime, large areas of the continental interiors have deep snow cover that melts over initially frozen and saturated soil. The surface parameterization used in HadAM3P (MOSES-I) deals with the excess water from the snowmelt by adding it to the downward moisture fluxes into the lower soil layers and eventually into subsurface runoff. However, the scheme used in HadGEM1 (MOSES-II) removes the excess water by adding it straight into the surface runoff if the top soil level is saturated. Thus, a significant proportion of the snowmelt is removed from the terrestrial system during spring, resulting in too little soil moisture by summer. The MOSES-II scheme has been modified in HadGEM2-AO to add such excess water over saturated soil into the downward moisture flux, as in MOSES-I. As a result, the soil as a whole is moister and vegetation suffers less water stress. The change to the treatment of runoff also has a major positive impact on the warm bias, reducing the average surface temperature by up to 4°C over Central Asia and by more than 2°C over North America (Fig. 13c). It should be noted that neither approach to the treatment of runoff from saturated soils is ideal. A more realistic approach, whereby some of the excess water percolates downward and some goes to surface runoff, is preferred and will be worked on in the future.
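The three treatments of excess meltwater over saturated soil discussed above can be contrasted in a small sketch. The function and its `fraction_down` parameter are hypothetical and only illustrate the bookkeeping: a value of 0 sends all excess water to surface runoff (as described for the original MOSES-II), a value of 1 passes it all into the downward moisture flux (as in MOSES-I and the modified scheme), and intermediate values stand for the mixed approach mentioned as future work.

```python
def partition_excess(excess_mm, fraction_down):
    """Split excess water over saturated soil (illustrative only).

    Returns (downward_flux_mm, surface_runoff_mm).
    """
    if not 0.0 <= fraction_down <= 1.0:
        raise ValueError("fraction_down must lie between 0 and 1")
    downward = excess_mm * fraction_down
    return downward, excess_mm - downward


# 10 mm of snowmelt arriving at a saturated top soil level
original_moses2 = partition_excess(10.0, 0.0)   # all lost to surface runoff
modified_scheme = partition_excess(10.0, 1.0)   # all retained as downward flux
mixed_approach = partition_excess(10.0, 0.5)    # assumed 50/50 split
```

The sketch makes the consequence explicit: only water in the first element of the returned pair remains available to the soil column, which is why the original treatment left too little soil moisture by summer.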
Following the inclusion of the mineral dust scheme in HadGEM2-AO, corresponding improvements to the albedo have been made, specifically, altering the bare soil albedo to match observations from MODIS (Moody et al. 2005; Houldcroft et al. 2009) and removing an artificial increase to the Saharan albedo, which was made to compensate for the lack of a mineral dust scheme in HadGEM1. These changes have global benefits and also help to alleviate the surface temperature biases over the northern continental regions. Similar improvements to bare soil albedo were implemented in the NWP configuration at cycle G44 (Table 1) with benefits for the surface energy balance and near-surface temperatures (Milton et al. 2008). Further improvements will be made to the albedo of vegetated surfaces following Houldcroft et al. (2009) in future configurations of the MetUM.
c. Impact of combined changes on systematic error
As described in the preceding section, several modifications have been introduced to alleviate the warm bias. When combined, the systematic warm bias over the Northern Hemisphere continental interiors is reduced substantially on climate time scales (Fig. 9c) and in short-range forecasts (Fig. 13d), with the surface runoff modification playing the largest role in this change (Fig. 13c). In addition to improving the warm bias, the modifications also increase the mean summer precipitation over western Central Asia, where it was too dry previously, and soil moisture increases by up to 25% (not shown). Over North America, the warm bias is also reduced significantly, particularly over the western part of the region. These changes in the mean near-surface air temperature bias are largely achieved through reductions in the maximum daytime temperatures (Fig. 9d). This is because several of the changes made to the model physics (e.g., the extended lifetime of convective cloud, the improved representation of aerosols, and the changes to surface albedo) mainly affect the daytime conditions. The remaining errors in nighttime minimum temperatures may have several causes and will require further detailed investigation.
5. Overall assessment of improvements
a. Model metrics
To illustrate the beneficial impact of a unified modeling strategy on model development, the performance of the resulting operational configurations of the MetUM at different time scales is now discussed. For many years, an assessment of the overall performance of an operational NWP model system has been made using a series of well-defined verification measures (or so-called model metrics) for key variables against observations. The World Meteorological Organization (WMO)-defined standard metrics have proved useful for model development at operational centers to focus on systematic errors and gauge their performance relative to other centers. However, care must be taken in interpreting such measures, given the focus on a relatively small number of variables, which can detract attention from improvements in other areas (e.g., model variability and extremes) that may be of increasing importance to customers. The climate modeling community has recently begun to utilize similar sets of measures to objectively assess both the performance of individual models and changes in the performance of different generations of models, such as those used for different reports of the Intergovernmental Panel on Climate Change (IPCC). Again, these must be applied with a considerable amount of caution, as the value of such simple measures is even less clear when assessing a model for its capability to predict an unknown future climate. Here, we use these metrics but note that, for assessing the performance of the MetUM across time scales, we also routinely carry out broader process-based analyses of the models’ capabilities, such as those shown earlier in the paper.
As part of the assessment process of new model/assimilation formulations and as an overall measure of NWP performance, the Met Office has used an “NWP Index” for a number of years. The NWP Index is made up of individual skill scores, measured against persistence, for meteorological fields compared with radiosondes, surface observations, and model analyses [see appendix A of Rawlins et al. (2007) for details]. The revisions to physical parameterizations that improved tropical performance (cycle G39) had a clear beneficial impact on the global NWP Index components, with a 3%–7% reduction in individual RMS errors (Fig. 14) and a 2.5-point improvement in the NWP Index. A similar comparison (not shown) for the changes at cycle G44, designed to improve the continental warm bias, showed smaller impacts on the standard NWP Index components but clearly had a positive impact on near-surface weather as discussed earlier.
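To make the idea of a skill score measured against persistence concrete, the sketch below uses one common form, 1 minus the ratio of forecast to persistence mean square error. This form is an assumption made here for illustration; the exact definition and weighting used in the NWP Index are given in appendix A of Rawlins et al. (2007) and are not reproduced.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def skill_vs_persistence(forecast, persistence, observed):
    """Skill of a forecast relative to persistence (assumed form).

    1.0 is a perfect forecast, 0.0 is no better than persistence,
    and negative values are worse than persistence.
    """
    r_f = rmse(forecast, observed)
    r_p = rmse(persistence, observed)
    return 1.0 - (r_f * r_f) / (r_p * r_p)


observed = [1.0, 2.0, 3.0]
persistence = [0.0, 0.0, 0.0]   # initial state held fixed (toy values)
forecast = [0.5, 1.0, 1.5]      # forecast errors are half the persistence errors
score = skill_vs_persistence(forecast, persistence, observed)
```

With this form, halving the RMS error relative to persistence gives a score of 0.75, which shows why modest percentage reductions in RMS error translate into comparatively small changes in a skill-based index.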
In the climate community, a number of metrics that aim to give a general assessment of model performance against present-day climate observations or reanalyses now exist in the literature (e.g., Murphy et al. 2004; Reichler and Kim 2008a; Gleckler et al. 2008). Reichler and Kim (2008a) use a composite performance measure I2 to show that current climate models are more realistic in simulating present-day mean climate than their predecessors. The measure is based on composite normalized mean square errors over a broad range of variables. First, I2 is derived by taking differences between simulated and observed mean climate in specific variables and over certain regions. Differences are scaled by the observed interannual variance prior to summing them up, helping to make outcomes from different variables more comparable. The resulting errors are further normalized by the average error found in all models participating in the third Coupled Model Intercomparison Project (CMIP-3; more information online at http://www-pcmdi.llnl.gov/ipcc/about_ipcc.php), leading to model- and quantity-specific I2 values. Finally, for each model an average I2 is calculated by taking the mean I2 in its individual variables.
Figure 15 presents quantity-specific I2 values for different regions. The differences between the gray (HadGEM1) and black circles (HadGEM2-AO) demonstrate a clear improvement in HadGEM2-AO against HadGEM1, notably in the tropics (TR) across all of the variables shown here. There is also noticeable improvement in the Northern Hemisphere (NH) across all of the variables, including surface air temperature over land. The vertical bars in Fig. 15 display the range of outcomes from the twentieth century simulations (1979–99) of the CMIP-3 models. Although the HadGEM2-AO simulation used preindustrial forcings, while the CMIP-3 simulations (including HadGEM1) are present-day, Reichler and Kim (2008a) found that the impact of using preindustrial rather than present-day forcings on the validation against current climate was small compared with the impact of different model generations, and in fact tended to decrease I2. In other words, using preindustrial simulations for the other CMIP-3 models would widen even more the already existing gap between HadGEM2-AO and the other models. Thus, in terms of simulating present-day mean climate, HadGEM2-AO is overall in a leading position relative to the other CMIP-3 models. This also becomes clear from the final column in each panel, which shows the average I2 across 37 different variables used by Reichler and Kim (2008a,b).
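The steps in the construction of I2 can be sketched as follows. This is a simplified reading of the procedure described above, not the code of Reichler and Kim (2008a): the function names, the use of squared differences with optional grid-point weights, and the toy numbers are all assumptions made for illustration.

```python
def normalized_error(simulated, observed, variance, weights=None):
    """Weighted mean of squared model-minus-observation differences,
    each scaled by the observed interannual variance at that point."""
    if weights is None:
        weights = [1.0] * len(observed)
    total = sum(
        w * (s - o) ** 2 / v
        for s, o, v, w in zip(simulated, observed, variance, weights)
    )
    return total / sum(weights)


def performance_index(errors_by_model):
    """errors_by_model: {model: {variable: normalized error}}.

    Returns (per-variable I2 for each model, average I2 for each model).
    """
    models = list(errors_by_model)
    variables = list(errors_by_model[models[0]])
    # Normalize by the average error over all models, variable by variable.
    mean_error = {
        v: sum(errors_by_model[m][v] for m in models) / len(models)
        for v in variables
    }
    i2 = {
        m: {v: errors_by_model[m][v] / mean_error[v] for v in variables}
        for m in models
    }
    # Final step: average over variables for each model.
    overall = {m: sum(i2[m].values()) / len(variables) for m in models}
    return i2, overall


# Toy two-model, two-variable ensemble (hypothetical values)
errors = {"A": {"t": 1.0, "p": 2.0}, "B": {"t": 3.0, "p": 2.0}}
i2, overall = performance_index(errors)
```

By construction the ensemble-mean I2 for each variable is 1, so values below 1 indicate a model that is better than the ensemble average for that quantity.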
b. Coupled model performance
1) ENSO
A key motivation for targeting tropical performance in the climate model was to improve the simulation of ENSO over that in HadGEM1. The combined impact of the changes implemented in HadGEM2-AO on the equatorial near-surface winds (Fig. 2b) shows a substantial improvement compared with HadGEM1 and this is also seen in the zonal mean wind stress over the critical Niño-4 region (Table 3). The changes in surface winds arising from the inclusion of adaptive detrainment are of comparable size to those in HadGEM2-AO (Fig. 2b), suggesting that much of the improvement in Niño-4 wind stress in HadGEM2-AO arises from this change. Other model changes in HadGEM2-AO, notably changes to the ocean background diffusivity (detailed in Collins et al. 2008), have also significantly reduced the mean global SST biases.
Together, we would expect these changes to have a positive impact on the mean state of the equatorial Pacific and the simulation of ENSO in the model, and many improvements compared with observations can be seen in the metrics listed in Table 3. In addition to the significant reduction in bias in mean Niño-4 wind stress and Niño-3 SST, the surface area of the Indo-Pacific warm pool (a key region for driving global atmospheric circulation) is substantially increased. The amplitude of the SST variability (as measured by the monthly standard deviation of the SST anomaly) across the Niño-3 region is improved relative to observations, and composite SST anomalies for El Niño events show an improvement in both the magnitude and spatial extent of the SST anomalies across the Pacific, which leads to a substantially improved response of precipitation to these anomalies (Fig. 16). The maximum precipitation anomaly, which was located to the west of the Maritime Continent in HadGEM1, has moved eastward to the west Pacific and a positive anomaly of greater than 0.5 mm day−1 is now present over much of the central and eastern equatorial Pacific. The horseshoe pattern of negative anomalies over the Maritime Continent and the regions extending to the southeast and northeast is also better represented. There is also an improvement in the correlation of Niño-3 SST with the Southern Oscillation index (Table 3).
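The variability measure quoted above, the monthly standard deviation of the SST anomaly, can be computed as in the following sketch. It assumes a plain list of area-averaged monthly SST values spanning at least one full year; the function names and the toy series are illustrative and do not come from the paper.

```python
import math


def monthly_anomalies(sst):
    """Remove the mean annual cycle from a monthly time series."""
    # Mean for each calendar month (every 12th value from each offset).
    climatology = [sum(sst[m::12]) / len(sst[m::12]) for m in range(12)]
    return [value - climatology[i % 12] for i, value in enumerate(sst)]


def anomaly_std(sst):
    """Standard deviation of the monthly anomalies (population form)."""
    anomalies = monthly_anomalies(sst)
    return math.sqrt(sum(a * a for a in anomalies) / len(anomalies))


# Two toy years: the same seasonal cycle, 1 degree warmer then 1 degree cooler
warm_year = [25.0 + m + 1.0 for m in range(12)]
cool_year = [25.0 + m - 1.0 for m in range(12)]
nino3_std = anomaly_std(warm_year + cool_year)
```

Because the seasonal cycle is removed first, a series that simply repeats the same annual cycle gives zero variability, and only interannual departures such as El Niño and La Niña events contribute to the measure.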
However, the frequency of large El Niño events (>1.5 standard deviations) is reduced in HadGEM2-AO compared with both HadGEM1 and observations (Table 3), and a power spectrum analysis reveals a weak signal at the observed time scale (∼4 yr), noticeable power at 6–7 years, and a dominant peak on decadal time scales (Collins et al. 2008, their Fig. 2.5). One of the mechanisms that leads to a change of phase of ENSO from El Niño to La Niña may be related to the thermocline mode of variability (Neelin et al. 1998; Guilyardi et al. 2003; Guilyardi 2006), which may be weakly simulated in both HadGEM1 and HadGEM2-AO. This mode seems to be much better simulated in a version of HadGEM1 with higher ocean and atmosphere resolution, as the ENSO power spectrum and variability found in that model are much more realistic (Shaffrey et al. 2009). Because the ENSO in HadGEM1 is confined close to the equator, and may be more of the “SST mode” form (see Guilyardi 2006), the ENSO in that model is able to change phase fairly regularly. In contrast, the ENSO in HadGEM2-AO has a more realistic north–south extent, but because it lacks the phase-changing process it tends to have a longer time scale than observed. Hence the removal of one error in the model may reveal other errors that were previously hidden. Clearly, an improvement in the capability of models to simulate the change of phase of ENSO will be a target for future models at all resolutions.
2) Earth-system feedbacks
A primary reason for improving the warm and dry biases in the physical model is to provide a more realistic surface continental climate for the growth and persistence of characteristic vegetation types when coupled to an interactive vegetation scheme as part of a full earth-system model. An indication of whether the package of changes described here has improved the surface continental climate sufficiently can be gained from examining the net primary productivity (NPP). This is the difference between the total carbon assimilated by photosynthesis and the carbon lost through plant respiration. NPP therefore represents the net uptake of carbon by the vegetation, so it is an important component of the terrestrial carbon cycle. Although the vegetation distribution in HadGEM2-AO and HadGEM1 is fixed, we can diagnose the NPP that would arise in a coupled earth-system model. The impact of the package of changes designed to address the continental near-surface temperature bias is to improve the NPP distribution (Fig. 17) compared with the International Satellite Land Surface Climatology Project (ISLSCP) dataset (Cramer et al. 1999). Whereas HadGEM1 shows significant negative biases in NPP over both continental regions, including some regions where the conditions are unsuitable for any vegetation growth (marked as missing data and left blank), with the combined modifications the biases are much smaller. A test of the HadGEM2-ES prototype atmosphere, with these changes included along with the interactive vegetation scheme, shows a substantial improvement in boreal tree density, soil carbon, and vegetation productivity, and also in the bare soil distribution, which is a useful prerequisite for improved interactions with the mineral dust scheme.
6. Summary
The reduction of systematic errors in general circulation models is a continuing challenge for improved climate and weather prediction. Feedbacks and compensating errors in climate models often make finding the source of a systematic error difficult. While there may be spinup errors in short-range forecasts that are not manifest on climate time scales, and similarly errors on climate time scales that only emerge after many months of integration, in this paper we have illustrated how the sources of those systematic biases that appear very early on and persist on long time scales can be identified by the use of the same model across a range of temporal and spatial scales. Two particular systematic errors have been examined: tropical circulation and precipitation distribution, and summer land surface temperature and moisture biases over Northern Hemisphere continental regions. Each of these was a cause for concern in both short-range forecasts and in climate simulations. In both cases, the errors were found to develop during the first few days of simulation. The ability to compare in detail the model diagnostics from the first few days of a forecast, initialized from a realistic atmospheric state, directly with observations has allowed deficiencies in the physical parameterizations to be identified which, when corrected, led to improvements on the full range of time scales, from a few days to several decades. There has been a marked improvement in the global NWP Index, and the new climate model version, HadGEM2-AO, exhibits enhanced performance over its predecessors, HadGEM1 and the third climate configuration of the Met Office Unified Model (HadCM3), and the ensemble of CMIP-3 models.
The unified modeling strategy employed by the Met Office has played a major role in this model development and evaluation process. However, observations of certain key quantities which may have contributed to our analysis of these systematic errors are still lacking across the range of time scales. For example, soil moisture measurements are limited to certain areas of the globe and a reliable long-term global record is lacking (Robock et al. 2000). This is being addressed through the development of a number of satellite-based measurements, such as the Advanced Scatterometer (ASCAT) (Bartalis et al. 2007; Wagner et al. 2007) and the Soil Moisture and Ocean Salinity mission (SMOS; more information online at http://smsc.cnes.fr/SMOS). Similarly, detailed information about cloud processes, ice and water contents and their conversion to precipitation is lacking. Experiments such as CloudSat (Stephens et al. 2002) are aiming to address this. Rainfall amounts over the oceans are also difficult to estimate because of the lack of in situ observations. Unfortunately, no satellite yet exists that can reliably identify rainfall and accurately estimate the rainfall rate in all circumstances. This is being addressed through measurement projects such as the Tropical Rainfall Measuring Mission (TRMM; more information online at http://trmm.gsfc.nasa.gov) and, in the future, the Global Precipitation Measurement mission (GPM; more information online at http://gpm.gsfc.nasa.gov).
Our study also highlights that the benefits of a unified prediction system across a wide range of time scales may only be fully realized if the full range of time scales is included. Until recently, the operational seasonal and decadal forecasting models used by the Met Office have differed from the model used for weather and climate time scales. This leaves a gap in the range of time scales between the medium-range (15 day) predictions and the centennial climate time scale. The seasonal and decadal modeling frameworks currently under development for the next-generation HadGEM family of climate models will enable the development of systematic errors on the monthly to decadal time scale to be studied. Finally, as Hurrell et al. (2009) point out, the drive toward increasingly complex and high-resolution models (e.g., those which represent earth-system feedbacks and regional extreme weather events) emphasizes the need for common processes to be addressed in a range of models of different resolution and complexity in order that progress can be made in all.
Acknowledgments
The authors thank Paul James and Dave Rowell for their initial analyses of the continental surface temperature bias. This work was supported by the Joint DECC and Defra Integrated Climate Programme - DECC/Defra (GA01101).
REFERENCES
Ackerman, T., and G. Stokes, 2003: The Atmospheric Radiation Measurement Program. Phys. Today, 56 , 38–44.
Allan, R., J. Lindesay, and D. E. Parker, 1996: El Nino–Southern Oscillation and Climatic Variability. CSIRO Publishing, 416 pp.
Allan, R., A. Slingo, S. F. Milton, and I. Culverwell, 2005: Exploitation of geostationary earth radiation budget data using simulations from a numerical weather prediction model: Methodology and data validation. J. Geophys. Res., 110 , D14111. doi:10.1029/2004JD005698.
Allan, R., A. Slingo, S. F. Milton, and M. Brooks, 2007: Evaluation of the Met Office global forecast model using Geostationary Earth Radiation Budget (GERB) data. Quart. J. Roy. Meteor. Soc., 133 , 1993–2010.
Bartalis, Z., W. Wagner, V. Naeimi, S. Hasenauer, K. Scipal, H. Bonekamp, J. Figa, and C. Anderson, 2007: Initial soil moisture retrievals from the METOP-A Advanced SCATterometer (ASCAT). Geophys. Res. Lett., 34 , L20401. doi:10.1029/2007GL031088.
Bellouin, N., O. Boucher, J. Haywood, C. Johnson, A. Jones, J. Rae, and S. Woodward, 2007: Improved representation of aerosols for HadGEM2. Hadley Centre Tech. Note 73, Met Office Hadley Centre. [Available online at http://www.metoffice.gov.uk/publications/HCTN/index.html].
Best, M. J., and P. E. Maisey, 2002: A physically based soil moisture nudging scheme. Hadley Centre Tech. Note 35, Met Office. [Available online at http://www.metoffice.gov.uk/publications/HCTN/index.html].
Betts, A., and C. Jakob, 2002: Study of diurnal cycle of convective precipitation over Amazonia using a single column model. J. Geophys. Res., 107 , 4732. doi:10.1029/2002JD002264.
Betts, R. A., P. M. Cox, M. Collins, P. P. Harris, C. Huntingford, and C. D. Jones, 2004: The role of ecosystem-atmosphere interactions in simulated Amazonian precipitation decrease and forest dieback under global climate warming. Theor. Appl. Climatol., 78 , 157–175.
Bodas-Salcedo, A., M. A. Ringer, and A. Jones, 2008: Evaluation of the surface radiation budget in the atmospheric component of the Hadley Centre Global Environmental Model (HadGEM1). J. Climate, 21 , 4723–4748.
Bowler, N., A. Arribas, K. Mylne, and K. Robertson, 2007a: Met Office Global and Regional Ensemble Prediction System (MOGREPS). Part I: System description. Forecasting Research Tech. Rep. 497, Met Office, 14 pp.
Bowler, N., A. Arribas, K. Mylne, K. Robertson, S. John, and T. Legg, 2007b: Met Office Global and Regional Ensemble Prediction System (MOGREPS). Part II: Case studies, performance and verification. Forecasting Research Tech. Rep. 498, Met Office, 14 pp.
Brown, A., and A. Grant, 1997: Non-local mixing of momentum in the convective boundary layer. Bound.-Layer Meteor., 84 , 1–22.
Brown, A., R. Beare, J. Edwards, A. Lock, S. Keogh, S. Milton, and D. Walters, 2008: Upgrades to the boundary layer scheme in the Met Office Numerical Weather Prediction model. Bound.-Layer Meteor., 128 , 117–132.
Ciesielski, P. E., R. H. Johnson, P. T. Haertel, and J. Wang, 2003: Corrected TOGA COARE sounding humidity data: Impact on diagnosed properties of convection and climate over the warm pool. J. Climate, 16 , 2370–2384.
Collins, W. J. Coauthors 2008: Evaluation of the HadGEM2 model. Met Office Hadley Centre Tech. Note 74, Met Office. [Available online at http://www.metoffice.gov.uk/publications/HCTN/index.html].
Cox, P. M., R. A. Betts, C. B. Bunton, R. L. H. Essery, P. R. Rowntree, and J. Smith, 1999: The impact of new land surface physics on the GCM simulation of climate and climate sensitivity. Climate Dyn., 15 , 183–203.
Cox, P. M., R. A. Betts, M. Collins, P. P. Harris, C. Huntingford, and C. D. Jones, 2004: Amazonian forest dieback under climate-carbon cycle projections for the 21st century. Theor. Appl. Climatol., 78 , 137–156.
Cramer, W. Coauthors 1999: Comparing global models of terrestrial net primary productivity (NPP): Overview and key results. Global Change Biol., 5 , 1–15.
Cubasch, U. Coauthors 2001: Projections of future climate change. Climate Change 2001: The Scientific Basis. J. T. Houghton et al., Eds., Cambridge University Press, 525–582.
Cullen, M., 1993: The Unified Forecast Climate Model. Meteor. Mag., 122 , 81–94.
Cusack, S., A. Slingo, J. Edwards, and M. Wild, 1998: The radiative impact of a simple aerosol climatology on the Hadley Centre climate model. Quart. J. Roy. Meteor. Soc., 124 , 2517–2526.
Davies, T., M. Cullen, M. H. Mawson, A. Staniforth, A. White, and N. Wood, 2005: A new dynamical core for the Met Office’s global and regional modelling of the atmosphere. Quart. J. Roy. Meteor. Soc., 131 , 1759–1782.
Derbyshire, S. H., I. Beau, P. Bechtold, J-Y. Grandpeix, J-M. Piriou, J-L. Redelsperger, and P. M. M. Soares, 2004: Sensitivity of moist convection to environmental humidity. Quart. J. Roy. Meteor. Soc., 130 , 3055–3079.
Derwent, R. G., W. J. Collins, M. E. Jenkin, C. E. Johnson, and D. S. Stevenson, 2003: The global distribution of secondary particulate matter in a 3-d Lagrangian chemistry transport model. J. Atmos. Chem., 44 , 57–95.
Edwards, J. M., 2007: Oceanic latent heat fluxes: consistency with the atmospheric hydrological and energy cycles and general circulation modeling. J. Geophys. Res., 112 , D06115. doi:10.1029/2006JD007324.
Essery, R., M. Best, and P. Cox, 2001: MOSES 2.2 technical documentation. Hadley Centre Tech. Note 30, Hadley Centre, Met Office. [Available online at http://www.metoffice.gov.uk/publications/HCTN/index.html].
Gates, W. L. Coauthors 1999: An overview of the results of the Atmospheric Model Intercomparison Project (AMIP I). Bull. Amer. Meteor. Soc., 80 , 29–55.
Gleckler, P. J., K. E. Taylor, and C. Doutriaux, 2008: Performance metrics for climate models. J. Geophys. Res., 113 , D06104. doi:10.1029/2007JD008972.
Gregory, D., and P. R. Rowntree, 1990: A mass flux convection scheme with representation of cloud ensemble characteristics and stability dependent closure. Mon. Wea. Rev., 118 , 1483–1506.
Gregory, D., and S. Allen, 1991: The effect of convective downdraughts upon NWP and climate simulations. Preprints, Ninth Conf. on Numerical Weather Prediction, Denver, CO, Amer. Meteor. Soc., 122–123.
Gregory, D., R. Kershaw, and P. M. Inness, 1997: Parametrization of momentum transport by convection. ii: Tests in single column and general circulation models. Quart. J. Roy. Meteor. Soc., 123 , 1153–1183.
Gregory, D., D. J. Shutts, and J. R. Mitchell, 1998: A new gravity wave drag scheme incorporating anisotropic orography and low level wave breaking: Impact upon the climate of the UK Meteorological Office Unified Model. Quart. J. Roy. Meteor. Soc., 124 , 463–493.
Guilyardi, E., 2006: El Nino-mean state-seasonal cycle interactions in a multi-model ensemble. Climate Dyn., 26 , 329–348.
Guilyardi, E., P. Delecluse, S. Guildi, and A. Navarra, 2003: Mechanisms for Enso phase change in a coupled GCM. J. Climate, 16 , 1141–1158.
Guilyardi, E., A. Wittenberg, A. Fedorov, M. Collins, C. Wang, A. Capotondi, G. J. van Oldenborgh, and T. Stockdale, 2009: Understanding El Nino in ocean–atmosphere general circulation models: Progress and challenges. Bull. Amer. Meteor. Soc., 90 , 325–340.
Hartmann, D. L., H. H. Hendon, and R. A. Houze, 1984: Some implications of the mesoscale circulations in tropical cloud clusters for large-scale dynamics and climate. J. Atmos. Sci., 41 , 113–121.
Houldcroft, C., W. Grey, M. Barnsley, C. Taylor, S. Los, and P. North, 2009: New vegetation albedo parameters and global fields of background albedo derived from MODIS for use in a climate model. J. Hydrometeor., 10 , 183–198.
Huffman, G., R. Adler, M. Morrisey, D. Bolvin, S. Curtis, R. Joyce, B. McGavock, and J. Susskind, 2001: Global precipitation at one degree daily resolution from multisatellite observations. J. Hydrometeor., 2 , 36–50.
Hurrell, J. W., and H. van Loon, 1997: Decadal variations in climate associated with the North Atlantic Oscillation. Climatic Change, 36 , 301–326.
Hurrell, J., G. A. Meehl, D. Bader, T. L. Delworth, B. Kirtman, and B. Wielicki, 2009: A unified modeling approach to climate system prediction. Bull. Amer. Meteor. Soc., 90 , 1819–1832.
Johns, T. C. Coauthors 2006: The new Hadley Centre climate model HadGEM1: Evaluation of coupled simulations. J. Climate, 19 , 1327–1353.
Jones, R. G., J. M. Murphy, D. C. Hassell, and M. J. Woodage, 2006: A high resolution atmospheric GCM for the generation of regional climate scenarios. Hadley Centre Tech. Note 63, Met Office Hadley Centre. [Available online at http://www.metoffice.gov.uk/publications/HCTN/index.html].
Klinker, E., and P. D. Sardesmukh, 1992: The diagnosis of mechanical dissipation in the atmosphere from large-scale balance requirements. J. Atmos. Sci., 49 , 608–627.
Koster, R. Coauthors 2004: Regions of strong coupling between soil moisture and precipitation. Science, 305 , 1138–1140.
Liebmann, B., and C. Smith, 1996: Description of a complete (interpolated) outgoing longwave radiation dataset. Bull. Amer. Meteor. Soc., 77 , 1275–1277.
Lock, A. P., A. R. Brown, M. R. Bush, G. M. Martin, and R. N. B. Smith, 2000: A new boundary layer mixing scheme. Part I: Scheme description and single-column model tests. Mon. Wea. Rev., 128 , 3187–3199.
Lorenc, A. C. Coauthors 2000: The Met. Office global three-dimensional variational data assimilation scheme. Quart. J. Roy. Meteor. Soc., 126 , 2991–3012.
Mahfouf, J-F., 1991: Analysis of soil moisture from near-surface parameters: A feasibility study. J. Appl. Meteor., 30 , 1534–1547.
Martin, G. M., M. R. Bush, A. R. Brown, A. P. Lock, and R. N. S. Smith, 2000: A new boundary layer mixing scheme. Part II: Tests in climate and mesoscale models. Mon. Wea. Rev., 128 , 3200–3217.
Martin, G. M., M. A. Ringer, V. D. Pope, A. Jones, C. Dearden, and T. J. Hinton, 2006: The physical properties of the atmosphere in the new Hadley Centre Global Environmental Model (HadGEM1). Part I: Model description and global climatology. J. Climate, 19 , 1274–1301.
Milton, S. F., and C. A. Wilson, 1996: The impact of parameterized subgrid-scale orographic forcing on systematic errors in a global NWP model. Mon. Wea. Rev., 124 , 2023–2045.
Milton, S. F., and P. Earnshaw, 2007: Evaluation of surface water and energy cycles in the Met Office Global NWP model using CEOP data. J. Meteor. Soc. Japan, 85 , 43–72.
Milton, S. F., I. Culverwell, and D. Cameron, 2001: Validation of parametrized forcing in NWP and climate models. Proc. ECMWF Seminar: Key Issues in the Parametrization of Subgrid Physical Processes, Shinfield Park, Reading, United Kingdom, ECMWF, 253–274.
Milton, S. F., G. Greed, M. Brooks, J. Haywood, B. Johnson, R. Allan, A. Slingo, and W. Grey, 2008: Modeled and observed atmospheric radiation balance during the West African dry season: Role of mineral dust, biomass burning aerosol, and surface albedo. J. Geophys. Res., 113, D00C02. doi:10.1029/2007JD009741.
Moody, E. G., M. D. King, S. Platnick, C. B. Schaaf, and F. Gao, 2005: Spatially complete global spectral surface albedos: Value-added datasets derived from Terra MODIS land products. IEEE Trans. Geosci. Remote Sens., 43, 144–158.
Murphy, J. M., D. M. H. Sexton, D. N. Barnett, G. S. Jones, M. J. Webb, M. Collins, and D. A. Stainforth, 2004: Quantification of modeling uncertainties in a large ensemble of climate change simulations. Nature, 430, 768–772.
Neelin, J. D., D. S. Battisti, A. C. Hirst, F-F. Jin, Y. Wakata, T. Yamagata, and S. E. Zebiak, 1998: ENSO theory. J. Geophys. Res., 103 (C7), 14261–14290.
New, M., M. Hulme, and P. Jones, 1999: Representing twentieth-century space–time climate variability. Part I: Development of a 1961–90 mean monthly terrestrial climatology. J. Climate, 12, 829–856.
Petch, J., M. Willett, R. Wong, and S. Woolnough, 2007: Modelling suppressed and active convection. Comparing a numerical weather prediction, cloud resolving and single column model. Quart. J. Roy. Meteor. Soc., 133, 1087–1100.
Phillips, T. J., and Coauthors, 2004: Evaluating parameterizations in General Circulation Models: Climate simulation meets weather prediction. Bull. Amer. Meteor. Soc., 85, 1903–1915.
Randall, D. A., and Coauthors, 2007: Climate models and their evaluation. Climate Change 2007: The Physical Science Basis, S. Solomon et al., Eds., Cambridge University Press, 589–662.
Rawlins, F., S. Ballard, K. Bovis, A. Clayton, D. Li, G. Inverarity, A. Lorenc, and T. Payne, 2007: The Met Office global 4-dimensional variational data assimilation scheme. Quart. J. Roy. Meteor. Soc., 133, 347–362.
Reichler, T., and J. Kim, 2008a: How well do coupled models simulate today’s climate? Bull. Amer. Meteor. Soc., 89, 303–311.
Reichler, T., and J. Kim, 2008b: Uncertainties in the climate mean state of global observations, reanalyses, and the GFDL climate model. J. Geophys. Res., 113, D05106. doi:10.1029/2007JD009278.
Ringer, M. A., and Coauthors, 2006: The physical properties of the atmosphere in the new Hadley Centre Global Environmental Model (HadGEM1). Part II: Aspects of variability and regional climate. J. Climate, 19, 1302–1326.
Robock, A., K. Y. Vinnikov, G. Srinivasan, J. K. Entin, S. E. Hollinger, N. A. Speranskaya, S. Liu, and A. Namkhai, 2000: The global soil moisture data bank. Bull. Amer. Meteor. Soc., 81, 1281–1299.
Rodell, M., and Coauthors, 2004: The global land data assimilation system. Bull. Amer. Meteor. Soc., 85, 381–394.
Rodwell, M. J., and T. N. Palmer, 2007: Using numerical weather prediction to assess climate models. Quart. J. Roy. Meteor. Soc., 133, 129–146.
Savage, N., S. Milton, D. Walters, and J. Heming, 2006: An assessment of the impact of new physical parametrisations on the performance of the Global Unified Model in THORPEX 15-day forecasts. Forecasting Research Tech. Rep. 496, Met Office, 30 pp.
Scaife, A. A., N. Butchart, C. D. Warner, D. Stainforth, W. Norton, and J. Austin, 2000: Realistic quasi-biennial oscillations in a simulation of the global climate. Geophys. Res. Lett., 27, 3481–3484.
Shaffrey, L., and Coauthors, 2009: U.K.-HiGEM: The new U.K. high-resolution global environment model—Model description and basic evaluation. J. Climate, 22, 1861–1896.
Slingo, J. M., and Coauthors, 1996: Intraseasonal oscillations in 15 atmospheric general circulation models: Results from an AMIP diagnostic subproject. Climate Dyn., 12, 325–357.
Solomon, S., D. Qin, M. Manning, Z. Chen, M. Marquis, K. B. Averyt, M. Tignor, and H. L. Miller, Eds., 2007: Climate Change 2007: The Physical Science Basis. Cambridge University Press, 996 pp.
Sperber, K. R., and T. N. Palmer, 1996: Interannual tropical rainfall variability in general circulation model simulations associated with the atmospheric model intercomparison project. J. Climate, 9, 2727–2750.
Stephens, G. L., and Coauthors, 2002: The CloudSat mission and the A-Train. Bull. Amer. Meteor. Soc., 83, 1771–1790.
Titley, H., N. Savage, R. Swinbank, and S. Thompson, 2008: Comparison between Met Office and ECMWF medium-range ensemble forecast systems. Forecasting Research Tech. Rep. 512, Met Office, 41 pp.
Uno, I., and Coauthors, 2006: Dust model intercomparison (DMIP) study over Asia: Overview. J. Geophys. Res., 111, D12213. doi:10.1029/2005JD006575.
Uppala, S. M., and Coauthors, 2005: The ERA-40 re-analysis. Quart. J. Roy. Meteor. Soc., 131, 2961–3012.
Van Oldenborgh, G. J., S. Y. Philip, and M. Collins, 2005: El Niño in a changing climate: A multi-model study. Ocean Sci., 1, 81–95.
Wagner, W., G. Blöschl, P. Pampaloni, J-C. Calvet, B. Bizzarri, J-P. Wigneron, and Y. Kerr, 2007: Operational readiness of microwave remote sensing of soil moisture for hydrologic applications. Nord. Hydrol., 38, 1–20.
Webster, P., and R. Lucas, 1992: TOGA COARE: The Coupled Ocean–Atmosphere Response Experiment. Bull. Amer. Meteor. Soc., 73, 1377–1416.
Willett, M., P. Bechtold, J. Petch, S. Milton, D. Williamson, and S. Woolnough, 2008: Modelling suppressed and active convection. Comparisons between three global atmospheric models. Quart. J. Roy. Meteor. Soc., 134, 1881–1896.
Williams, K. D., and M. E. Brooks, 2008: Initial tendencies of cloud regimes in the Met Office unified model. J. Climate, 21, 833–840.
Woodward, S., 2001: Modeling the atmospheric life cycle and radiative impact of mineral dust in the Hadley Centre climate model. J. Geophys. Res., 106 (D16), 18155. doi:10.1029/2000JD900795.
Xie, P., and P. A. Arkin, 1997: Global precipitation: A 17-year monthly analysis based on gauge observations, satellite estimates, and numerical model outputs. Bull. Amer. Meteor. Soc., 78, 2539–2558.
Yang, G-Y., and J. Slingo, 2001: The diurnal cycle in the tropics. Mon. Wea. Rev., 129, 784–801.
Zhang, C., 2005: Madden–Julian Oscillation. Rev. Geophys., 43, 1–36.

(left) Zonally averaged temperature and (right) zonal wind biases for June–August (JJA): (a),(b) 10-yr mean climatology from HadGAM1 (N96) − 40-yr ECMWF Re-Analysis (ERA-40); (c),(d) days 11–15 of NWP medium range (MOGREPS) forecasts (N144) − Met Office analyses for JJA 2003 and 2006; (e),(f) as in (c),(d), but for days 1–5; (g),(h) days 1–5 operational NWP forecasts (N216) − Met Office analyses for JJA 2004.
Citation: Journal of Climate 23, 22; 10.1175/2010JCLI3541.1


Equatorial Pacific (5°N–5°S) 1000-hPa zonal wind component for (a) Met Office analyses and day 1, day 3, and day 5 NWP forecasts (cycle G32) during JJA 2004; (b) climatologies from HadGAM1, HadGAM1 and adaptive detrainment (10-yr average), and HadGEM2-AO (30-yr average) for JJA; and (c) impact of revised physics package (cycle G39) vs cycle G38 from trials during August 2005.

Mean profiles of (a) temperature error (model − sonde) and (b) relative humidity for July–September 2003 from the MetUM and sondes from ARM tropical warm pool (TWP) site at Manus Island. The sonde data are available at 0000 and 1200 UTC; FC0612: 6 to 12 h forecast; FC1824: 18 to 24 h forecast; FC3036: 30 to 36 h forecast.

Tropical precipitation (40°N–40°S) in mm day⁻¹ for (a) CMAP JJA 1979–98, (b) CMAP − GPCP (v2.0) JJA 1979–98, (c) HadGAM1 JJA 1979–98, (d) HadGAM1 − CMAP, (e) medium-range NWP forecasts for JJA 2003 and 2006, (f) medium-range NWP forecasts − GPCP, (g) HadGAM1 (plus adaptive detrainment) − HadGAM1, (h) impact of cycle G39 physics revisions on medium-range NWP forecasts (cycle G39 − G38), (i) HadGAM1 (plus adaptive detrainment) − CMAP, and (j) NWP medium-range cycle G39 − GPCP.

The 200-hPa divergent flow (velocity potential) during JJA for (a) ERA-40 analysis, (b) HadGAM1 − ERA-40, (c) HadGAM1 (plus adaptive detrainment) − ERA-40, (d) Met Office analysis for JJA 2003 and 2006, (e) days 11–15 of MOGREPS-15 forecasts JJA 2003 and 2006 − Met Office analysis, and (f) MOGREPS-15 plus cycle G39 physics (includes adaptive detrainment) − Met Office analysis. Units are 2 × 10⁶ m² s⁻² for analyses and 1 × 10⁶ m² s⁻² for forecast differences.