A Diagnostic Evaluation of Precipitation in CORDEX Models over Southern Africa

Abstract
The authors evaluate the ability of 10 regional climate models (RCMs) to simulate precipitation over Southern Africa within the Coordinated Regional Climate Downscaling Experiment (CORDEX) framework. An ensemble of 10 regional climate simulations and the ensemble average are analyzed to evaluate the models' ability to reproduce seasonal and interannual regional climatic features over regions of the subcontinent. All the RCMs use a similar domain, have a spatial resolution of ~50 km, and are driven by the Interim ECMWF Re-Analysis (ERA-Interim; 1989–2008). Results are compared against a number of observational datasets. In general, the spatial and temporal nature of rainfall over the region is captured by all RCMs, although individual models exhibit wet or dry biases over particular regions of the domain. Models generally produce lower seasonal variability of precipitation compared to observations and the magnitude of the variability varies in space and time. Model biases are related to model setup, ...


Introduction
The aim of this paper is to make use of the regional climate model (RCM) control simulation results produced within the Coordinated Regional Climate Downscaling Experiment (CORDEX; Giorgi et al. 2009; Jones et al. 2011) initiative to provide an evaluation of the models' ability to reproduce the regional climatic features over Southern Africa. Interannual variability patterns and climatic means are examined, with focus on precipitation biases and related large-scale circulation and sea level pressure (SLP) fields.
As evidence of climate change and its impacts continues to emerge, Africa is found to be most vulnerable because of the dependence of communities on natural resources and low adaptive capacity due to limited access to information, finances, technology, and capital assets. Agricultural production (e.g., Lobell et al. 2008), livestock (e.g., Thornton et al. 2009), food security (Hertel et al. 2010), freshwater resources, and socioeconomic structures in many African countries are particularly vulnerable to the effects of climate change (Boko et al. 2007; Chevallier 2010). Africa is characterized by low adaptive capacity and high vulnerability, so a response to the threat of climate change requires both mitigation and adaptation measures. The existing development challenges from nonclimate factors faced by most African countries reduce their capacity to adapt to additional climate stresses. For example, the limited access to markets, infrastructure, technology, data, and information further reduces the adaptive potential and increases the continent's vulnerability (Washington et al. 2006).
Low-resolution global climate models (GCMs) show that the twenty-first century warming rates over Africa will outpace global warming in all seasons, with the dry subtropics such as southwestern Africa warming faster than the wet tropical regions (Christensen et al. 2007; James and Washington 2013). Over Southern Africa a higher frequency of extreme temperature events such as heat waves is associated with the higher warming rates (Orlowsky and Seneviratne 2012). Southwestern Africa is projected to become drier (reduction in soil moisture) and to experience an increase in the frequency of consecutive dry days (Tebaldi et al. 2006; Sillmann and Roeckner 2008; Orlowsky and Seneviratne 2012), with the dry conditions and droughts intensifying toward the end of the twenty-first century (Hoerling et al. 2006; Shongwe et al. 2009). The eastern part of Southern Africa is projected to become wetter, although there are indications that the character of precipitation will feature fewer intense rainfall events separated by a high frequency of dry days, particularly in convective regions (Tebaldi et al. 2006; Rocha et al. 2008). Confidence in future projections of precipitation-related weather variables is, however, limited by the inability of GCMs to include many important small-scale processes and factors that underlie their occurrence and variability in time and space (Randall et al. 2007). To address these shortcomings and to support informed decision-making processes with climate information at relevant spatial and temporal resolution, downscaling procedures have been developed and applied to derive regional- to local-scale information from coarse-resolution GCMs.
Two downscaling techniques are used to produce regional climate information. Statistical downscaling uses empirical relationships between large-scale atmospheric variables and local climate variables and has demonstrated the potential to produce useful and applicable information over Southern Africa (e.g., Hewitson and Crane 2006). Precipitation projections derived from simulated synoptic-scale circulation changes using statistical downscaling show a consensus across GCMs, with the eastern parts of South Africa projected to become wetter toward the end of the twenty-first century (Hewitson and Crane 2006). However, the application of statistical downscaling methods is hampered in certain parts of Southern Africa (e.g., Angola and the Democratic Republic of Congo) by the lack or dearth of sufficiently long observational data required at point or station scale to train and validate the statistical model. An alternative but computationally intensive approach that does not suffer from this constraint and captures mesoscale nonlinear effects is dynamical downscaling using high-resolution RCMs or global models with variable spatial resolution (Giorgi 1990; Giorgi and Mearns 1999).
RCMs are a widely used tool for producing regional climate data that use boundary conditions from a forcing coarse-scale global dataset such as a GCM or reanalysis to simulate the climate of a particular region. Previous studies demonstrate that RCMs, while showing some improvements over the driving GCMs in simulating precipitation over Southern Africa, also show some persistent biases (Arnell et al. 2003; Haensler et al. 2011) such as having unrealistic rain day frequency and rainfall intensity (Tadross et al. 2006). RCM performance is found to be dependent on the internal physics (e.g., hydrostatic versus nonhydrostatic; Hewitson et al. 2004; Tadross et al. 2006), dynamics, and atmosphere-land surface feedbacks associated with soil moisture and vegetation cover (Tadross et al. 2005; Williams and Kniveton 2012). Dependence of the simulated climate change on the driving model, time slice, season, and location has been apparent. In close agreement with precipitation changes projected by GCMs (Shongwe et al. 2009) and statistical downscaling (Hewitson and Crane 2006), RCMs give indications of drying over southwestern Africa and wetter conditions to the southeast of South Africa (Tadross et al. 2006; Engelbrecht et al. 2009; Haensler et al. 2011). Downscaling techniques suffer from uncertainties inherited from the driving GCMs and from those associated with their internal workings. To systematically explore uncertainties associated with dynamical downscaling, coordinated RCM simulations are necessary. Such coordinated efforts are available in other regions, such as Europe (e.g., Christensen and Christensen 2007), North America (Mearns et al. 2009), and South America (Menéndez et al. 2010). CORDEX is the first effort that focuses on the whole of Africa and furthermore treats it as a priority domain (Giorgi et al. 2009).
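CORDEX consists of two phases: in the first phase, downscaling centers are asked to downscale a reference/verification period using the Interim European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis (ERA-Interim); in the second phase, they downscale Coupled Model Intercomparison Project phase 5 (CMIP5) GCMs for the period 1951-2100.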
The extent to which CORDEX RCMs can adequately capture the important climate features and simulate key variables such as precipitation in Africa is not yet fully understood. However, this is an essential prerequisite as climate information obtained from the high-resolution CORDEX data will be used for climate change adaptation strategies and other applications. There is therefore a need to objectively evaluate the ability of CORDEX reference runs to simulate key climate variables, particularly precipitation, as an essential first step prior to assessing downscaled climate change scenario results over Africa.
African CORDEX results already published include Nikulin et al. (2012) and Kim et al. (2013), who demonstrate the models simulate the mean seasonal and annual cycle adequately, although there are spatial biases amongst the model ensemble. Their results show that the multimodel average generally outperforms any individual simulation and that the RCMs significantly improve the precipitation climate compared to the driving ERA-Interim. Nikulin et al. (2012) also note that in the majority of the RCMs precipitation is triggered too early during the diurnal cycle. Kim et al. (2013) also assessed CORDEX models and show systematic biases across all models for different regions (especially for precipitation in drier regions such as South Africa). They conclude that this poses difficulty in defining a single representative index when validating models over broad areas and suggest that regional-specific analyses should be performed. In agreement with Nikulin et al. (2012), Kim et al. (2013) also conclude that the multimodel ensemble average generally outperforms individual ensemble members. Hernández-Díaz et al. (2013) analyzed the CRCM5 (see Table 1 for model expansions) and found it to adequately capture the seasonal precipitation climatology over much of Africa and the annual cycle over southeastern Africa east of 20°E. Slight biases were found in simulating the diurnal precipitation cycle.
In this paper, we apply statistical approaches to evaluate rainfall simulation by 10 CORDEX RCMs over Southern Africa. To the extent possible on the basis of climate variables available in the current CORDEX archive, the ability of the RCMs to simulate known circulation patterns underlying Southern African climate is also assessed. In section 2, the data used in the study and the methods of data analyses are described. Results from our analyses are presented in section 3. Section 4 discusses potential sources of model bias over the region, and section 5 summarizes and concludes.

Data and methodology
Data for this analysis were produced by many different downscaling groups across the globe (see Table 1). Initially rainfall and temperature data were provided by the downscaling groups to the Swedish Meteorological and Hydrological Institute (SMHI), who postprocessed these data so they had identical grids, domain sizes, and data formats for direct comparison. These data were provided at the monthly time scale to the regional groups of the CORDEX-Africa analysis initiative, which started in 2011. However, as the analysis progressed additional prognostic variables were desired to understand the downscaled results: for example, winds at particular levels to examine the presence of jets and humidity at multiple levels to understand moisture transport. Unfortunately, these variables could often not be made available at a later stage by most centers (for reasons such as space and model upgrades) and subsequently many variables were not available for inclusion in this analysis. However, through this experience CORDEX has evolved very strict data standards and requirements as well as distribution nodes, which will facilitate direct access to CORDEX data in the future.
The evaluation of precipitation results in the reference climate simulations is restricted by the availability of high quality observational datasets for the region of study that are at the appropriate spatial resolution and for the required time period. Several observational datasets are available for model evaluation and have been reviewed to some extent in Nikulin et al. (2012) and Sylla et al. (2013). Nikulin et al. (2012) show large differences between satellite- and gauge-based products and highlight the lack of reporting station gauges over large parts of Africa. Sylla et al. (2013) state that the uncertainty in the observations is a key factor preventing a rigorous and unambiguous evaluation of climate models over Africa.
To obtain an estimate of the spread in the precipitation values between the individual observed fields and provide a sense of observational uncertainty, model results were compared against a number of different observational datasets. Three gauge-based gridded observational datasets are used, which are available on a 0.5° latitude-longitude grid and include the region of Southern Africa and the time period 1990-2006: the Global Precipitation Climatology Centre (GPCC), version 5 (Rudolf et al. 2010); the University of Delaware (UDEL), version 2.01 (Legates and Willmott 1990); and the Climatic Research Unit (CRU) at the University of East Anglia, version 3.0 (Mitchell and Jones 2005). For the limited time period of 1998-2006, the results from the gauge-based datasets are compared against the satellite-based dataset from the Tropical Rainfall Measuring Mission (TRMM) 3B42, version 6 (Huffman et al. 2007), with a 0.25° spatial resolution and the satellite-gauge dataset from the Global Precipitation Climatology Project (GPCP), version 1.1 (Huffman et al. 2001), with a 1° spatial resolution. The different gridded products agree quite well in terms of large-scale precipitation patterns (Gruber et al. 2000; Fekete et al. 2004), though the scarcity of station data in certain regions of Southern Africa (Nikulin et al. 2012), compared to other regions of the world, can lead to local deviations. In accordance with other studies, the TRMM dataset is found to have a dry precipitation bias in the entire northern part of Southern Africa during the wet season (e.g., Nikulin et al. 2012) and a spatially confined dry bias in the southeastern part of South Africa during the austral winter season. The GPCC, UDEL, and GPCP data were compared against mean monthly precipitation data available for 428 stations in South Africa and Mozambique for the period 1990-2006 and all three datasets were found to compare well against point observations (appendix).
Although GPCC has an overall slightly better performance than the UDEL and GPCP gridded datasets, local differences can occur (see appendix). The GPCC dataset was chosen as the reference observational dataset because its time series is long enough to cover the CORDEX downscaling period (1989-2008), it has the same resolution as the RCM simulated data, and the regions used in the analysis were classified based on GPCC data. Other datasets are also compared against GPCC, and areas where station data are scarce and larger deviations may be expected are highlighted. Nevertheless, the primary goal of the work presented in this paper, which is to assess the models' relative performance, is not found to be affected by the choice of observational dataset. We evaluate the downscaled atmospheric fields against ERA-Interim, as this was also the reanalysis dataset chosen by the CORDEX initiative to provide the lateral boundary conditions for the RCM reference climate simulations. The third-generation ERA-Interim includes a number of improvements over the 40-yr ECMWF Re-Analysis (ERA-40; Dee et al. 2011). It has been found to agree well with gauge-based and combined gauge and satellite products (Simmons et al. 2010) and is thus considered an appropriate dataset for the assessment of model performance for a number of climate variables. Sea level pressure and wind fields at the 850-hPa level were used to assess the models' ability to reproduce a physically consistent framework that captures the global circulation patterns.
ERA-Interim precipitation is also used in the comparisons to provide a measure of uncertainty between reanalysis data and observations. A detailed comparison of precipitation in reanalysis data and observations over Southern Africa can be found in Zhang et al. (2013). The 10 different RCMs described in Table 1 were used to simulate the African climate during the period 1990-2008 (1989 was spinup). For the CORDEX reference runs addressed in this paper, all the models have been forced using ERA-Interim data as boundary conditions and the models were not nudged over the domain. To facilitate comparison against the different observational datasets, the results presented focus on the period 1990-2006. Although each model used its individual setup (number of vertical levels, grid rotation, parameterization schemes, etc.), a common horizontal resolution of 0.44° was required. In the results presented, this is used as a reference grid and all datasets on other grids are interpolated to this common resolution. The TRMM dataset was aggregated from the native 0.25° grid onto the reference grid by a first-order conservative remapping (Jones 1999), while all other datasets with similar or coarser resolution than 0.44° were remapped through bilinear interpolation as in Nikulin et al. (2012). The details of each model can be found in Table 1.
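The difference between conservative remapping and interpolation can be illustrated with a minimal one-dimensional sketch. It is not the Jones (1999) scheme itself: the function name `conservative_remap_1d`, the cell edges, and the precipitation values are hypothetical, and a real application on the sphere would weight by cell area rather than by length.

```python
import numpy as np

def conservative_remap_1d(src_edges, src_values, dst_edges):
    """First-order conservative remap of cell averages in 1D (illustrative only).

    Each destination cell receives the overlap-weighted mean of the source
    cells it intersects, so the integral over the domain is preserved.
    """
    src_edges = np.asarray(src_edges, dtype=float)
    dst_edges = np.asarray(dst_edges, dtype=float)
    src_values = np.asarray(src_values, dtype=float)
    out = np.full(len(dst_edges) - 1, np.nan)
    for j in range(len(dst_edges) - 1):
        lo, hi = dst_edges[j], dst_edges[j + 1]
        # Overlap length between destination cell j and every source cell.
        overlap = np.minimum(hi, src_edges[1:]) - np.maximum(lo, src_edges[:-1])
        overlap = np.clip(overlap, 0.0, None)
        if overlap.sum() > 0:
            out[j] = np.sum(overlap * src_values) / overlap.sum()
    return out

# Hypothetical fine strip (0.25-degree cells) aggregated to coarser cells.
src_edges = np.arange(0.0, 2.01, 0.25)                            # 8 source cells
src_values = np.array([1.0, 3.0, 2.0, 4.0, 0.0, 6.0, 5.0, 1.0])  # mm/day
dst_edges = np.array([0.0, 0.5, 1.0, 2.0])                        # 3 target cells
remapped = conservative_remap_1d(src_edges, src_values, dst_edges)

# The length-weighted totals on both grids should match.
src_total = np.sum(src_values * np.diff(src_edges))
dst_total = np.sum(remapped * np.diff(dst_edges))
```

Aggregating a finer field this way keeps the area-integrated precipitation unchanged, which point-wise bilinear interpolation does not guarantee; that is a plausible reason for treating the finer TRMM grid differently from the coarser products.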
In addition to the overall circulation patterns over Southern Africa, within the study area three specific subregions were also examined to provide an overview of model performance across regions with nominally homogeneous rainfall annual cycle patterns (Liebmann et al. 2012). The extent of each subregion is selected based on GPCC-derived standardized annual precipitation cycles over Africa, which are distinguished into nine types (classes) using the k-means method. The location of the subregions and their topography is shown in Fig. 1. Subregion 1 experiences a mainly tropical to subtropical rainfall regime which is largely convective in nature and influenced by the position of the ITCZ. Rainfall results from mesoscale convective systems, tropical storms, and tropical cyclones that make landfall having come from the Mozambique Channel. Rainfall is highly seasonal and occurs in the austral summer with an annual peak in precipitation on average recorded in January-February. The southwestern part of this region is drier (Kalahari) than the northern and eastern parts. Subregion 2 lies in a subtropical rainfall regime and also experiences convective rainfall through the action of a semipermanent thermal low pressure system in the austral summer. Rain-producing systems include mesoscale convective systems, warm fronts, subtropical lows, mid- to upper-tropospheric troughs, and cloud bands. Spatially, the western part of the region is drier than the eastern part as a result of the general central to westward location of the thermal low in the summer (trade winds coming from the Indian Ocean). Subregion 3 experiences a subtropical climate with wet conditions in late autumn and winter with the annual peak in precipitation in June-July. Rainfall results primarily from transient midlatitude low pressure systems like cold fronts and is largely stratiform and orographic. The coastal regions are wetter than the interior and there is less precipitation as one moves northward.
For more details on rainfall and weather regimes in these regions, the reader is referred to Tyson and Preston-Whyte (2000).
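The classification of standardized annual cycles with k-means can be sketched as follows. This is an illustration under stated assumptions, not the procedure used to define the subregions: the data are synthetic, only two classes are used instead of nine, and the helper names `standardize_cycles` and `kmeans` as well as the farthest-first initialization are choices made for this example.

```python
import numpy as np

def standardize_cycles(monthly_clim):
    """Scale each cell's 12-month climatology to zero mean and unit SD."""
    monthly_clim = np.asarray(monthly_clim, dtype=float)
    mean = monthly_clim.mean(axis=1, keepdims=True)
    sd = monthly_clim.std(axis=1, keepdims=True)
    return (monthly_clim - mean) / np.where(sd > 0, sd, 1.0)

def kmeans(data, k, n_iter=100):
    """Lloyd's k-means with a deterministic farthest-first initialization."""
    centroids = [data[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen centroid.
        nearest = np.min([np.linalg.norm(data - c, axis=1) for c in centroids], axis=0)
        centroids.append(data[nearest.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Assign each cycle to the closest centroid (Euclidean distance).
        dist = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster becomes empty.
        updated = np.array([
            data[labels == c].mean(axis=0) if np.any(labels == c) else centroids[c]
            for c in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

# Synthetic climatologies (mm/day): 20 summer-rainfall and 20 winter-rainfall cells.
rng = np.random.default_rng(0)
months = np.arange(12)                                          # 0 = January
summer = 3.0 + 2.5 * np.cos(2 * np.pi * (months - 0.5) / 12)    # peak in Jan-Feb
winter = 1.2 + 0.8 * np.cos(2 * np.pi * (months - 6.0) / 12)    # peak in July
cells = np.vstack([summer + rng.normal(0, 0.1, (20, 12)),
                   winter + rng.normal(0, 0.1, (20, 12))])
labels, centroids = kmeans(standardize_cycles(cells), k=2)
```

Because the cycles are standardized first, cells are grouped by the shape and timing of the annual cycle rather than by rainfall amount, which is why a dry and a wet area can fall in the same class.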
A seasonal analysis is performed on area-averaged results, in order to assess the models' ability to reproduce the annual precipitation cycle. A quantitative evaluation of model performance is carried out using Taylor diagrams (Taylor 2001), which have been widely used to assess model results on the basis of the root-mean-square error, Pearson's correlation coefficient, and standard deviation (Taylor 2001;Covey et al. 2003).
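The three statistics summarized on a Taylor diagram can be computed as in the sketch below. It assumes that the RMSE plotted is the centered (mean-removed) RMS difference, as is standard for Taylor diagrams (Taylor 2001); the function name `taylor_statistics` and the sample values are hypothetical.

```python
import numpy as np

def taylor_statistics(model, reference):
    """Return SD of both series, Pearson correlation, and centered RMSE."""
    model = np.asarray(model, dtype=float)
    reference = np.asarray(reference, dtype=float)
    sd_model = model.std()          # population SD (ddof=0)
    sd_ref = reference.std()
    corr = np.corrcoef(model, reference)[0, 1]
    # Centered RMSE: both means are removed before differencing.
    diff = (model - model.mean()) - (reference - reference.mean())
    crmse = np.sqrt(np.mean(diff ** 2))
    return {"sd_model": sd_model, "sd_ref": sd_ref, "corr": corr, "crmse": crmse}

# Hypothetical seasonal-mean precipitation (mm/day) for eight years.
reference = np.array([4.1, 3.2, 5.0, 4.4, 2.9, 3.8, 4.7, 3.5])
model = np.array([3.6, 3.0, 4.2, 4.5, 2.5, 3.1, 4.0, 3.4])
stats = taylor_statistics(model, reference)

# Law-of-cosines relation that lets one diagram show all three statistics.
identity_rhs = (stats["sd_model"] ** 2 + stats["sd_ref"] ** 2
                - 2 * stats["sd_model"] * stats["sd_ref"] * stats["corr"])
```

The relation `crmse**2 == sd_model**2 + sd_ref**2 - 2*sd_model*sd_ref*corr` is what allows a single point to encode correlation (angle), standard deviation (radius), and centered RMSE (distance from the reference point).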

a. Seasonal analysis for the three different climatic regions
The differences between model simulated and observed seasonal precipitation estimates are first assessed using area-averaged results for each of the three subregions shown in Fig. 1. Highlighting problematic seasons and areas will guide further analysis on circulation features that are likely responsible for precipitation biases in the model results.
Taylor diagrams (Taylor 2001) were used to assess the models' ability to simulate precipitation in each subregion (Figs. 2-4) using seasonal means of December-February (DJF), March-May (MAM), June-August (JJA), and September-November (SON). The results shown are based on the interannual variation of seasonal mean precipitation for the period 1990-2006. Each model is compared against GPCC observations using root-mean-square error (RMSE), Pearson's correlation coefficient, and standard deviation (SD). To provide an overview of observational uncertainty, UDEL, CRU, GPCP, TRMM, and ERA-Interim are also compared to GPCC and plotted on the same diagram. Similar analyses (Taylor diagrams) conducted using the UDEL and CRU datasets as reference (not shown) confirm that model performance is largely independent of the reference dataset. Where possible, ERA-Interim-derived rainfall is used to provide a measure of uncertainty between the reanalysis data and observations. The seasonal cycle results for each subregion (area-averaged monthly mean precipitation values) are shown in Fig. 5.
For subregion 1 (Fig. 2) large differences are observed in model performance across all seasons. The models are able to capture the phase of interannual variability better during the wet (DJF) and dry (JJA) seasons, where more than half of the RCMs are found to have correlation coefficients greater than 0.50 (statistically significant at the 5% level). The correlation is weaker for MAM and SON (Figs. 2b,d). In agreement with previous studies (e.g., Weigel et al. 2010; Nikulin et al. 2012), the ensemble mean is generally found to outperform the individual models. It performs better in terms of correlation coefficient and RMSE, though it tends to systematically underestimate the magnitude of the interannual variability relative to GPCC, except in SON. Furthermore, the correlation and RMSE between the ensemble mean and GPCC are similar to or better than the correlation and RMSE of the other observed datasets. Notable is the poor correlation between the UDEL and CRU observational datasets and GPCC during SON and JJA, which may be attributed to spatial differences within the chosen subregion across the gridded datasets, discussed further in section 3b. Although the relative model performance, both in terms of area-averaged results and in terms of spatial precipitation patterns (discussed in section 3c), was found to be largely independent of the reference dataset, for these two seasons in subregion 1 a much weaker correlation coefficient between all model results and the UDEL and CRU datasets was found. For JJA a correlation below 0.1 was found for all models compared to UDEL and below 0.5 compared to CRU, whereas the correlation against GPCC is between 0.4 and 0.85.
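The seasonal means entering the Taylor diagrams can be derived from monthly values roughly as sketched here. The treatment of DJF is an assumption made for this example (December is assigned to the season of the following year, incomplete seasons are dropped, and months are weighted equally); the source does not state these details, and the function name `seasonal_means` is hypothetical.

```python
import numpy as np

SEASONS = {"DJF": (12, 1, 2), "MAM": (3, 4, 5), "JJA": (6, 7, 8), "SON": (9, 10, 11)}

def seasonal_means(years, months, values, season):
    """Return {season_year: mean of the three monthly values} for one season."""
    years = np.asarray(years)
    months = np.asarray(months)
    values = np.asarray(values, dtype=float)
    wanted = SEASONS[season]
    in_season = np.isin(months, wanted)
    # Assumption: December counts toward the DJF season of the following year.
    if season == "DJF":
        season_year = np.where(months == 12, years + 1, years)
    else:
        season_year = years
    result = {}
    for y in np.unique(season_year[in_season]):
        sel = in_season & (season_year == y)
        if sel.sum() == len(wanted):      # skip incomplete seasons at record ends
            result[int(y)] = float(values[sel].mean())
    return result

# Hypothetical three-year monthly record; the value equals the month number.
years = np.repeat(np.arange(1990, 1993), 12)
months = np.tile(np.arange(1, 13), 3)
values = months.astype(float)

djf = seasonal_means(years, months, values, "DJF")
jja = seasonal_means(years, months, values, "JJA")
```

With this convention a 1990-2006 monthly record yields one fewer complete DJF season than JJA seasons, which matters when counting the samples behind a correlation coefficient.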
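The link between a correlation of about 0.5 and significance at the 5% level can be checked with a short calculation. The sketch assumes a two-sided Student's t test on 17 independent seasonal values (1990-2006); the test actually used is not stated in the text, and the critical t value of 2.131 for 15 degrees of freedom is taken from standard tables.

```python
import math

def critical_correlation(n_samples, t_crit):
    """Smallest |r| that is significant, given the critical t for n-2 d.o.f."""
    dof = n_samples - 2
    # Inverts t = r * sqrt(dof / (1 - r**2)) for r.
    return t_crit / math.sqrt(dof + t_crit ** 2)

# Assumed: 17 seasons, two-sided 5% level, tabulated t(0.975, 15) = 2.131.
r_crit = critical_correlation(17, 2.131)
```

Under these assumptions the threshold is close to 0.48, consistent with correlations above 0.50 being described as significant; serial correlation between years would raise the threshold.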
On an individual model basis, the ARPEGE5.1 model consistently outperforms all models as it captures the interannual variation of seasonal precipitation very well for all seasons and has a similar (in one case better) correlation coefficient, RMSE and standard deviation compared to the ensemble mean. ARPEGE5.1 achieves a correlation with GPCC of above 0.82 for all seasons, a similar standard deviation to GPCC (though lower for all seasons except SON) and an RMSE of less than 0.4 across all seasons.
The reanalysis data compare well against GPCC with a correlation coefficient above 0.85 for all seasons, a standard deviation that is similar to that of GPCC, and an RMSE less than 0.42 across all seasons.
Subregion 2 (Fig. 3) shows large variation in model performance across all seasons. The correlation coefficient for all models compared to GPCC varies between 0.20 and 0.96, with individual values found to span the whole continuum between these two extremes. Models appear to perform slightly better for MAM, where eight models are found to have a correlation with observation of above 0.6, the standard deviation is close to that of GPCC, and the RMSE is less than 0.5 for all but two models. The seasonal cycle results for this subregion shown in Fig. 5 (area-averaged monthly mean precipitation values) confirm that the models have a realistic cessation of the rainfall season over this area. Similarly to subregion 1 results, the ARPEGE5.1 model consistently outperforms all models for all seasons and captures the interannual variation of seasonal precipitation very well, with a correlation coefficient of above 0.86 in all cases. It is found to have a standard deviation which is generally very close to or slightly less than that of GPCC, confirming that it is also able to capture the amplitude of the interannual variation. For all seasons the ARPEGE5.1 model is found to perform better than the multimodel ensemble and also has similar statistics compared to the other observational datasets.
There is generally good agreement between GPCC and the other observational and reanalysis data, with a correlation above 0.85 for all seasons and all datasets, a low RMSE, and a similar standard deviation.
Subregion 3 (Fig. 4) is characterized by winter rainfall as opposed to summer rainfall in the other two subregions. Most models are able to adequately capture the interannual variation of the winter season (JJA) rainfall in this region and all but one model have a correlation with the observations of greater than 0.60. The RMSE is less than 0.3 in all cases, and the standard deviation is found to be close to that of GPCC, implying that the models have a realistic amplitude over the western region of South Africa. Large differences in the models' performance are found for the dry summer period (DJF), where the correlation with observation ranges between 0 and 0.86 with individual values found to span the whole continuum between these two extremes, though the standard deviation is close to that of GPCC for all but three models. Models also appear to have a difficulty in capturing the interannual variation of seasonal precipitation for MAM, as only two models (ARPEGE5.1 and RACMO2.2b) have a correlation with observation above 0.5.
[Taylor diagram caption] Here, 10 model results, the multimodel ensemble, four observational datasets (UDEL, CRU, GPCP, and TRMM), and one reanalysis dataset (ERA-Interim) are compared against GPCC observational data. The radial coordinate gives the magnitude of total SD (mm day⁻¹) (the solid radial highlights the SD of GPCC), the angular coordinate gives the correlation with observations (GPCC), and the distance between the observed point on the x axis and any other point is proportional to the RMSE (mm day⁻¹) (gray radials). Note the scales of the SD and RMSE radials are not constant in all the images to allow for assessment of the correlation coefficient where SD and RMSE are small.
ARPEGE5.1 is again found to perform better than other models for subregion 3 and when compared to GPCC it has a correlation coefficient of above 0.70 for all seasons. However, although it performs better in terms of the correlation coefficient, for the dry DJF season in this region it has an unrealistically large standard deviation of 0.49 mm day⁻¹ compared to 0.19 mm day⁻¹ for GPCC. The multimodel ensemble is found to outperform individual models in DJF, despite the higher precipitation simulated by most models for these months (Fig. 5). This, as in other seasons and regions, is likely a function of the cancellation of error of each model in the ensemble mean. In other seasons the ensemble mean generally performs better than most models, with some models performing equally well (especially in JJA).
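The cancellation of errors in an ensemble mean can be illustrated with synthetic data. The sketch below assumes independent, zero-mean member errors, which real RCMs sharing the same boundary forcing will not fully satisfy; all values and names are hypothetical.

```python
import numpy as np

def rmse(a, b):
    """Root-mean-square error between two equally long series."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

rng = np.random.default_rng(42)
n_models, n_years = 10, 17

obs = rng.normal(0.0, 1.0, n_years)                   # synthetic observed anomalies
errors = rng.normal(0.0, 0.8, (n_models, n_years))    # independent member errors
members = obs + errors                                # each row is one "model"
ensemble_mean = members.mean(axis=0)

member_rmse = np.array([rmse(m, obs) for m in members])
ensemble_rmse = rmse(ensemble_mean, obs)
```

By the triangle inequality, `ensemble_rmse` can never exceed the average of `member_rmse`, whatever the errors are; how far below it falls depends on how independent the member errors are, so shared systematic biases limit the gain.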
The correlation of the reanalysis precipitation with GPCC is found to be similar to that of the multimodel ensemble, the RMSE is found to be below 0.2 for all seasons and the standard deviation is found to be very close to that of the GPCC dataset with the exception of JJA where an underestimation of 40% is noted. There is generally good agreement between GPCC and the other observational data, with a correlation above 0.85 for all seasons and all datasets, though other datasets are found to have a somewhat lower standard deviation.
The seasonal cycle results in Fig. 5 (area-averaged monthly mean precipitation values) show that for subregions 1 and 2 there is an overall good agreement between the different observational datasets and the reanalysis data, and the models are able to capture the pattern seen in the observations. For subregion 3 there is good agreement between the CRU, UDEL, and GPCC observational datasets in terms of both the seasonal pattern and the magnitude of precipitation. The TRMM and GPCP data (satellite and merged satellite-gauge dataset, respectively) show lower precipitation compared to the other observations for the wet winter months. The characteristically lower precipitation for the month of June is due to the shorter time period used for these two datasets (1998-2006), with all the other datasets also found to have similarly lower values during this shorter time period for this subregion. Precipitation in the reanalysis data for subregion 3 during the wet winter months is also found to be slightly lower than in the observations. The individual model results for subregion 3 in Fig. 5 show that a number of models simulate a higher precipitation during the dry summer months, which leads the ensemble mean pattern to display a 1-month lag in the onset of the rainfall season in this subregion with respect to the observations. A wet precipitation bias in the southwest of South Africa is also noted for some models in the spatial distribution of precipitation for DJF, which is discussed in section 3d (Fig. 8). The discussion in that section concludes that for a number of models the summer rainfall regime may penetrate too far southwest into the winter rainfall region, perhaps because of an overestimation of convective rainfall in summer over the inland part of the region, leading to higher precipitation estimates during these months.
We also note the models show a peak in rainfall in June and not July as seen in the reference datasets (except GPCP and TRMM), which is also discussed further in section 3d.

b. Comparison of observational datasets: An estimate of observational uncertainty
The comparison of GPCC and other observational datasets in Figs. 2-4 shows that in all subregions there are seasons for which the area-averaged interannual variation of monthly mean precipitation across the different observational datasets can vary and in some cases the differences are large (e.g., JJA season in subregion 1). To study whether there are spatial differences within each subregion that lead to these differences in the area averages, the correlation between the variation of monthly mean precipitation in GPCC and other datasets is calculated and shown in Fig. 6 in the form of correlation maps (only values which are found to be significant at the 5% level are plotted). The limited time period of 1998-2006 is chosen so that comparison against the TRMM satellite-based dataset can also be included. Similar results are obtained using the period 1990-2006 (not shown). The results displayed in Fig. 6 show that there is overall good correlation between the variation of monthly mean precipitation of other observational datasets and GPCC, though there are areas within the chosen subregions for which the correlation is lower (albeit significant). Very similar results are obtained using TRMM as reference (not shown), confirming that GPCC does not show a systematic departure from other datasets across the region studied. A comparison of the standard deviations (Fig. 7) reveals differences of up to 1.5 mm day 21 in subregions 1 and 2 depending on the dataset, with TRMM having a systematically lower standard deviation in the northern part of the domain, as it is known to have a dry bias in this area (Nikulin et al. 2012).
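A correlation map with a significance mask can be produced along the following lines. The sketch assumes arrays shaped (time, lat, lon), a plain Pearson correlation in time at each grid cell, and a threshold of about 0.19 as an approximate two-sided 5% level for 108 independent monthly values (1998-2006); whether the seasonal cycle was removed first is not stated in the text. The function name `correlation_map` and the data are hypothetical.

```python
import numpy as np

def correlation_map(a, b, r_crit):
    """Grid-cell Pearson correlation along time; NaN where |r| < r_crit."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_anom = a - a.mean(axis=0)
    b_anom = b - b.mean(axis=0)
    cov = (a_anom * b_anom).mean(axis=0)
    denom = a.std(axis=0) * b.std(axis=0)
    # Leave cells with zero variance undefined instead of dividing by zero.
    r = np.divide(cov, denom, out=np.full(cov.shape, np.nan), where=denom > 0)
    return np.where(np.abs(r) >= r_crit, r, np.nan)

# Synthetic 2 x 2 grid with 108 monthly values per cell.
rng = np.random.default_rng(1)
n_time = 108
a = rng.normal(size=(n_time, 2, 2))
b = rng.normal(size=(n_time, 2, 2))
b[:, 0, 0] = 2.0 * a[:, 0, 0] + 1.0    # perfectly correlated cell
b[:, 0, 1] = -a[:, 0, 1]               # perfectly anticorrelated cell
b[:, 1, 0] = 5.0                       # constant series: correlation undefined

r_map = correlation_map(a, b, r_crit=0.19)
```

Masked cells appear as NaN, so they are left blank when the map is plotted, mirroring the convention of showing only significant values.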
Within subregion 1, in the area of northern Mozambique, most other datasets have a lower standard deviation compared to GPCC. There is a high correlation between the gridded GPCC, GPCP, and UDEL datasets and station data available for this region (see appendix). However, the GPCC data have a slightly lower RMSE compared to GPCP, and UDEL is found to have a very small RMSE. Precipitation biases between the observational datasets for individual seasons (not shown) show lower precipitation during DJF (and MAM) in the CRU, TRMM, and UDEL data compared to GPCC for this region (northern Mozambique), with differences reaching up to 3 mm day⁻¹. Although TRMM is known to have a dry bias in the area (Nikulin et al. 2012), it is likely that GPCC slightly overestimates precipitation for this particular region and the comparison against station data would imply that UDEL is closer to observations. However, it should be noted that this does not affect the discussion of model performance for this area of the domain (section 3c), as the precipitation biases in the model results are greater than the differences between the observed datasets and a number of models are found to underestimate precipitation over northern Mozambique for all seasons, irrespective of the gauge-based reference dataset used (CRU, UDEL, or GPCC; not shown here). For other seasons the differences in precipitation between GPCC and other gauge-based datasets across most of Southern Africa are generally small and TRMM is found to have a dry bias over the southwestern part of the domain, especially during JJA, and the entire northern part of the domain during most seasons.
It is not the aim of this paper to analyze these differences in detail, as this has been the subject of other studies. However, it is important to note these spatial differences and highlight that care needs to be taken when analyzing area averages. For certain regions a number of factors, such as the quantity and quality of available station data, the different interpolation techniques, or (in the case of satellite data) the gauge analysis products used in the adjustments, can contribute to a higher uncertainty in terms of the reliability and validity of the datasets (Yin and Gruber 2010; Nikulin et al. 2012; Tapiador et al. 2012). The selection of reference data is becoming an important factor in model assessment as more ground-, space-, and reanalysis-based rainfall products become available, which all use different methods to produce precipitation data. For example, satellite-based rainfall datasets include among others the Tropical Applications of Meteorology using Satellite Data (TAMSAT; Grimes et al. 1999), which use different algorithms to produce the respective rainfall products. Other products merge station and satellite data to produce precipitation data: for example, the Famine Early Warning System (FEWS) dataset; the TRMM 3B42 dataset, which is based on satellite data; the CAMS global gridded rain gauge data (Janowiak and Xie 1999); and the GHCN station data. The Water and Global Change (WATCH) dataset is based on reanalysis and station data (Weedon et al. 2011). It therefore becomes theoretically possible to reduce the apparent rainfall bias of a particular regional model for a particular season over particular regions simply by selecting the closest observed precipitation product. We suggest, along with Nikulin et al. (2012), that downscaled rainfall data be validated against an ensemble of observational data so that the variability within these datasets can be taken into account.

c. Biases in seasonal precipitation patterns
Model results for precipitation across Southern Africa (1990-2006) are compared to GPCC in Fig. 8 for DJF. The intercomparison of model results shows that model performance is similar across all seasons, so only DJF is shown, which is the wet season for most of the region. Model results were also compared against other observational datasets for the same time period (CRU and UDEL; not shown) and the results presented and discussed in this section apply irrespective of the reference dataset used.
When compared to GPCC, most models are found to undersimulate precipitation in the northeastern part of the domain and to a lesser extent in the north and northwestern parts (especially during MAM) and higher precipitation results are found in the southeastern part, especially over the Lesotho highlands and Drakensberg areas. The WRF3.1.1 RCM and to a lesser extent PRECIS have a wet bias over most of Southern Africa, though WRF3.1.1 shows a slightly different pattern as it also has a widespread dry bias in the northwestern part of the domain. HIRHAM5 and CCLM4.8 have the driest bias in the northeastern part of the domain. CRCM5 is found to perform best and has a spatial precipitation pattern which is largely consistent with that of GPCC for all seasons and the differences seen in Fig. 8 are on the same order or less than those seen in other models. An exception is JJA, where this model shows a slight wet bias of ~2 mm day⁻¹ in the northeastern part of the domain when compared against any observational dataset that covers the entire period 1990-2006 (GPCC, UDEL, or CRU). Overall, CRCM5 has results which are very close to the multimodel ensemble and depending on the season the model performs slightly better (DJF and SON) or slightly worse (JJA) than the ensemble model results.
The actual precipitation values in Fig. 8 show that ARPEGE5.1 has a similar performance to other models across the entire Southern African region, though it does not perform as well as the results in Figs. 2-4 would suggest. This is most probably due to error compensations when computing the area average values. In subregion 1 the DJF dry bias over northern Mozambique and eastern Zambia may be offset somewhat by the weak wet bias over most of Botswana. Similarly, in subregion 2 the dry bias over southern Mozambique and northern South Africa may be offset by the wet bias over and around Lesotho. In subregion 3 the wet precipitation bias in the southwest of South Africa is likely to lead to the unrealistically large standard deviation seen in Fig. 4 during the region's dry season.
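The error compensation described above can be made concrete with a small numerical sketch in Python. The four-cell subregion, the equal cell areas, and all precipitation values below are hypothetical and chosen only to show how opposing local biases cancel in an area average while remaining visible in a pointwise error measure.

```python
import math
from statistics import fmean


def area_mean_bias(model, obs):
    """Bias of the area average, assuming equal-area grid cells."""
    return fmean(m - o for m, o in zip(model, obs))


def rmse(model, obs):
    """Root-mean-square error computed cell by cell."""
    return math.sqrt(fmean((m - o) ** 2 for m, o in zip(model, obs)))


# Hypothetical DJF precipitation (mm/day) in four equal-area cells of one subregion.
obs = [6.0, 6.0, 2.0, 2.0]
# The model is too dry by 2 mm/day in the first two cells and too wet by 2 mm/day in the last two.
model = [4.0, 4.0, 4.0, 4.0]

bias = area_mean_bias(model, obs)
error = rmse(model, obs)
print(bias, error)
```

Here the area-mean bias is zero even though every cell is wrong by 2 mm/day. On a real latitude-longitude grid the cells are not equal in area, so the averages would need area weights.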
The comparison of the spatial distribution of simulated precipitation between the models and the GPCC for JJA (not shown) confirms the results of Fig. 5 and shows that most models to a greater or lesser extent underestimate precipitation during the wet winter period in the southwestern tip of Africa. This finding is independent of the comparison dataset used. It is interesting to note certain regional details where, in addition to the large-scale dynamics, local topography is also expected to challenge the models. The undersimulation of precipitation in the northeastern part of the domain seen in most model results during DJF and MAM is more pronounced over Lake Malawi, a key geomorphological feature that needs to be appropriately resolved. However, it is admittedly challenging to capture the large-scale evaporation-condensation-precipitation cycle taking place over Africa's third largest lake at the spatial resolution defined in CORDEX. An underestimation of precipitation over Lake Malawi for DJF and MAM is also observed in ERA-Interim. In the southeastern part of the domain, all models to a greater or lesser extent have a wet bias over the Drakensberg Mountains/Lesotho highlands, where the complex topography associated with this geomorphological feature clearly poses a modeling challenge. The relatively poor results in subregion 3 will likely be influenced by the small size of the region and the relatively coarse 0.44° CORDEX resolution. The region has complex orography which is smoothed at this resolution and is also at a land-sea boundary. Furthermore, the southwest corner of the CORDEX Africa domain is close to the region so the models do not have enough domain space to freely simulate transient midlatitude cyclones that bring rainfall to the region.
As already discussed, the comparison of the different gauge-based observational datasets per season does not show a systematic tendency toward lower or higher precipitation estimates in the GPCC data for Southern Africa. The only main difference is found in the northeastern part of the domain, where precipitation in GPCC is somewhat higher (not shown), as already discussed in section 3b. Nevertheless, this does not affect the models' relative performance and the discussion of Fig. 8. The only difference is that in the multimodel ensemble results the dry bias over Mozambique is less pronounced when comparing against CRU and UDEL data, as these datasets indicate less precipitation in this area. Hence, the tendency toward lower/higher precipitation estimates in the model results is related to difficulties in modeling the larger circulation patterns and for particular areas the complex topography and its impact on the microclimate, and it is not influenced by biases in the observations.

d. Circulation features and links to precipitation
In Fig. 9, SLP anomalies and 850-hPa wind anomalies are used to examine the large-scale circulation over Southern Africa, as simulated by the different models.
We also attempt to relate these to the precipitation biases evident in some of the models.
The higher precipitation observed in the WRF3.1.1 and PRECIS results for DJF (Fig. 8) could be explained by the very deep low pressure bias over the land and the cyclonic circulation anomaly in the 850-hPa winds, which indicates increased moisture transport from the tropics and Mozambique Channel into the region. These factors, as well as the creation of a generally more unstable atmosphere, are likely to result in the wet bias in these models.
In the northwestern part of the domain over Angola, a deeper low pressure bias is observed in some models compared to ERA-Interim. This enhanced Angola low pressure system leads to less moisture being advected into the Angola region (related to part of the cold Benguela Current that continues flowing north along the Namibian-Angolan coast), which leads to lower precipitation in the northwestern part of the domain.
In the southeastern part of the domain, the enhanced precipitation observed over the Lesotho Mountains is related to a stronger onshore flow than that observed in the ERA-Interim data. This can be seen in the 850-hPa wind anomalies in Fig. 9 for some models (WRF3.1.1, MPI-REMO, PRECIS, and ARPEGE5.1). For all models, this can more clearly be seen in the actual 850-hPa wind fields and streamline plots (not shown). The stronger onshore flow transports more moisture inland, which results in increased orographic precipitation over areas with higher elevation.
The differences in the simulation of the large-scale circulation that most likely lead to an underestimation of precipitation in the southwestern tip of the domain (winter rainfall region) during JJA compared to the GPCC (seen to a greater or lesser degree in all models) are most probably linked to a less pronounced northward shift of the westerlies during winter and possibly also a weaker magnitude of these overall. It is likely that this is also the case in the ERA-Interim data (an underestimation of precipitation is also noted in ERA-Interim compared to the other reference datasets), and therefore the anomaly maps for JJA cannot be used to confirm this. However, from Fig. 5 it is evident that over this region all the models place the rainfall maximum in June instead of July. Sea level pressure for each model, as well as the model ensemble average, was compared against ERA-Interim SLP for June and July. We found that the subtropical westerly belt has a maximum northward displacement during June instead of July, which is also associated with a large high pressure anomaly over the interior of the country during July. The semipermanent high pressure system over most of Southern Africa during winter is caused by the descending limb of the Hadley circulation.
This would suggest the models are oversimulating the strength of the Hadley circulation during July, which results in a descending limb inhibiting the northward movement of the westerlies during this month, and therefore the maximum rainfall peak is produced in June, when the descending limb is weaker. More investigation will be dedicated to winter transient rainy systems, such as cutoff lows (Favre et al. 2013) in the CORDEX outputs in future studies.
During DJF there is a wet precipitation bias in the southwestern tip of the domain, which is likely a function of an anomalously deep subcontinental low pressure over the region being produced by the models. In Fig. 9 this is apparent in five of the models as well as the ensemble mean, indicating that the extent of the subtropical low is too far south. Wind anomalies at 850 hPa in these models suggest that too much moisture is advected into the region either from the interior of the country (northeast) or from the Agulhas Current (east). The combination of a low pressure system intruding into the region from the interior as well as moisture advection could introduce the wet rainfall bias through convective precipitation. In further studies, moisture data will be used to examine this hypothesis.
There is a dry bias noted in the model results over Malawi and in the northern part of Mozambique during the rainy season. Two mechanisms for this are proposed here; however, because of data constraints we were unable to investigate these further. First, of the models that show a strong dry bias (CCLM4.8, ARPEGE5.1, HIRHAM5, RACMO2.2b, and MPI-REMO), there are indications of a negative pressure bias at the surface west of the dry bias in CCLM4.8, MPI-REMO, and to some degree HIRHAM5 as well as enhanced subtropical easterlies (trades) at 850 hPa in the latter four models. This may shift the ITCZ westward (during summer in this region the ITCZ has a north-south orientation), which is evident to some degree in the rainfall fields. Second, models tend to overestimate precipitation north of Madagascar over the sea and most models also overestimate precipitation in the Mozambique Channel (not shown). This would likely remove moisture from the atmospheric column that is necessary for rainfall over northern Mozambique. Unfortunately, because of the data constraints described above, we could not test the models' ability to capture stability, vorticity, and moisture transport. It is interesting to note that similar precipitation bias patterns are found in the ERA-Interim data compared to GPCC.

e. Interannual variability of summer precipitation associated with the ENSO
A regional impact of El Niño-Southern Oscillation (ENSO) on Southern African rainfall has been elucidated in a number of studies. In particular, El Niño warm events are found to favor seasonal droughts in South Africa (Lindesay 1988; Van Heerden et al. 1988; Reason et al. 2000), especially since the 1970s (Richard et al. 2000; Richard et al. 2001), and La Niña cold events are linked to a positive rainfall anomaly (Nicholson and Selato 2000). The season where the ENSO influence is found to be greatest varies across the region (Nicholson and Selato 2000), but since it is not the main aim of the paper to study the influence in detail the results presented here focus on DJF, the main rainfall season for most of Southern Africa.
El Niño composite plots and La Niña composite plots are shown in Figs. 10 and 11, respectively. In Fig. 10, there is overall good agreement across the different observational datasets and ERA-Interim. In Fig. 11, the agreement between ERA-Interim and observations is still good, though in the reanalysis data the extent of the wetter conditions during La Niña events in South Africa is perhaps too large. The strength and extent of the negative precipitation anomaly in the northwestern part of the domain seen in the GPCC data, to a lesser extent in the UDEL data, and only slightly in the CRU data are possibly due to a bias in the observations and the uncertainty associated with the few station data available in this area (see Nikulin et al. 2012).
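The composite anomalies in Figs. 10 and 11 are, in essence, the mean over the event seasons minus the mean over the full period. A minimal Python sketch of that calculation for a single grid cell or area average follows. The function name, the convention of labeling each DJF season by the year of its December, and the precipitation values are all assumptions made for illustration; only the La Niña seasons (1995/96, 1998/99, and 1999/00) are taken from the Fig. 11 caption.

```python
from statistics import fmean


def composite_anomaly(seasonal_means, event_years):
    """Event-composite anomaly relative to the full-period mean.

    seasonal_means: dict {season_year: DJF mean precipitation in mm/day}.
    event_years: iterable of season years that belong to the composite.
    """
    climatology = fmean(seasonal_means.values())
    composite = fmean(seasonal_means[year] for year in event_years)
    return composite - climatology


# Hypothetical DJF means (mm/day); keys are the year of the December in each season.
djf_means = {
    1994: 3.0, 1995: 5.0, 1996: 3.0, 1997: 2.0,
    1998: 5.0, 1999: 6.0, 2000: 4.0,
}
# La Niña seasons 1995/96, 1998/99, and 1999/00 under the assumed labeling.
la_nina_years = [1995, 1998, 1999]

anomaly = composite_anomaly(djf_means, la_nina_years)
print(round(anomaly, 3))
```

A positive result indicates wetter than average conditions during the selected events. Applying the same function to every grid cell would give an anomaly map.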
To a greater or lesser extent, all models are found to capture the drier conditions observed during El Niño in central and eastern Southern Africa (Fig. 10). Both ERA-Interim and the model results, however, tend to show a positive precipitation anomaly over southern Angola (only slightly evident in the observations), which in the model results is also found to extend farther south into Namibia. All models capture the wet precipitation anomaly during El Niño in northern Mozambique, though in some cases the magnitude and the extent of the anomaly in the model results show a drier bias (e.g., RegCM3 and RCA3.5). The strength of the wet precipitation anomaly over northern Mozambique is somewhat reduced in the ensemble mean results and is similar to that of ERA-Interim.
In Fig. 11, the models are found to capture the overall positive precipitation anomaly in eastern Southern Africa during the La Niña events, as seen in the observations. However, in certain areas some models do not capture the magnitude or the sign of the wetter conditions (e.g., MPI-REMO, PRECIS, and RCA3.5 over Zimbabwe and in some cases Botswana). A number of models also undersimulate the positive precipitation anomaly in southern Mozambique and in some cases even have an opposite sign. The negative precipitation anomaly expected in the northeast of the domain is very close to the domain's boundaries and also extends beyond them; it is thus not possible to analyze it here. Farther south in the domain, the southward spatial extent of the wetter conditions over South Africa during La Niña events is found to be greater in all models compared to the observations. The model MPI-REMO also shows a dry bias over Zambia, which is not seen to such an extent in the observations. The common differences in the model results compared to the observations are also reflected in the multimodel ensemble mean, which compared to GPCC shows a weaker wet precipitation anomaly over southern Mozambique, a spatially extended wet precipitation anomaly over South Africa, and very small precipitation changes over Zimbabwe and northern South Africa.

a. Observation datasets
The GPCC rainfall dataset was used as the reference dataset, although it is recognized that the area-averaged interannual variation of monthly mean precipitation across the different observational datasets varied and in some cases this variation was large. In some regions the observational differences were as large as the multimodel ensemble mean bias. This highlights the challenge in Africa of the lack of high quality observation datasets at suitable temporal and spatial resolution necessary for evaluating RCM simulations. With the introduction of many more satellite-based precipitation datasets as well as satellite/reanalysis-station merged observed datasets, the selection of observed datasets will likely play a role in assessing model bias. We concur with the suggestion of Nikulin et al. (2012) to consider an ensemble mean approach to observational datasets to address this; however, this may limit the observational period to post-1998 to include the satellite-based TRMM and GPCP data.

b. Model biases
Regional model studies often involve tuning a regional model so that it has an optimal configuration over a particular spatial domain to produce the best results with respect to some observed dataset. They can be tuned in terms of domain size, lateral boundary condition (LBC) forcing, horizontal and vertical resolution, and the atmospheric and surface physics (see Crétat et al. 2012). Some model setups have even been found to be seasonally sensitive (Ishak et al. 2012). Usually these types of studies focus over a region of interest, or a country or local area, and the models are optimized to get the most accurate results for this location and for a specific season.

FIG. 11. Precipitation anomaly maps (mm day⁻¹) for DJF associated with La Niña events (composites using data for 1995/96, 1998/99, and 1999/00 and average for 1990-2006) as simulated by CORDEX RCMs, compared against the observational datasets (GPCC, UDEL, and CRU) and the reanalysis data (ERA-Interim). Negative anomalies are indicated in orange, and positive anomalies are shown in blue.

JOURNAL OF CLIMATE, VOLUME 26
In studies over Africa, the African Monsoon Multidisciplinary Analysis (AMMA) program has focused over West Africa using models including RegCM3, WRF3.1.1, and ARPEGE5.1 in its experiments (Hourdin et al. 2010; Sylla et al. 2010; Flaounas et al. 2011; Mariotti et al. 2011; Paeth et al. 2011; Ruti et al. 2011 (Kgatuke et al. 2008; Sylla et al. 2012); and a stretch grid global model configured to run at a regional scale, the Conformal-Cubic Atmospheric Model (McGregor 2005a,b; Engelbrecht et al. 2009, 2011). Over East Africa, models used to examine climate include RegCM2 (Song et al. 2004), RegCM3 (Anyah and Semazzi 2007; Davis et al. 2009), WRF3.1.1 (Zhang 2007; Pohl et al. 2011), and the CLM (Kaspar and Cubasch 2008). However, within the CORDEX framework (see Giorgi et al. 2009; Jones et al. 2011), the domain size, LBC, and horizontal resolution are prescribed so only vertical resolution and model physics can be optimized. The vertical resolution has been shown to not contribute to rainfall biases as much as the cumulus scheme (Gianotti et al. 2012); therefore, the model's internal variability and selected physics packages are the primary sources of bias, of which the selection of cumulus scheme has been shown to be of major importance. The choice of cumulus scheme can change the sign of the rainfall bias (Tadross et al. 2005; Liang et al. 2007), spatial distribution of rainfall (Liang et al. 2004b; Jankov et al. 2005; Zanis et al. 2009), and diurnal timing (Liang et al. 2004a; Wang et al. 2007; Nikulin et al. 2012).
The African continent is the second largest landmass in the world and spans climate regimes from equatorial to Mediterranean, and its climate controls are mainly mesoscale and teleconnective. Models simulating the CORDEX African domain are therefore required to both propagate the large-scale teleconnective controls into the domain and capture the finer-scale convective controls. As convective activity is the dominant source of rainfall over Africa in the equatorial, tropical, and wet summer subtropical regions, convective parameterizations are a significant source of variability within models as well as between models. The challenge posed to the regional models within the CORDEX framework then is that they cannot be optimized for a particular region or climate regime or time period.
In the 10 models assessed in this paper, two convective schemes were prevalent: the Tiedtke (HIRHAM5, CCLM4.8, RACMO2.2b, and MPI-REMO) and the Kain-Fritsch (RCA3.5, WRF3.1.1, and CRCM5) schemes. Those schemes were not used for the PRECIS (Gregory and Rowntree), RegCM3 (Grell and Fritsch-Chappell closure scheme), and ARPEGE5.1 (Bougeault) simulations (see Table 1). Nikulin et al. (2012) examined these schemes using frequency and intensity of 3-hourly precipitation to verify the diurnality of rainfall over West Africa. Observations show the highest frequency of heavy rainfall to occur in the late afternoon and evening with high frequencies of light rainfall either side of noon. However, most of the models, regardless of the cumulus scheme used, generally produce heavy rainfall too early in the day (from late morning to late afternoon) and light rainfall throughout the night. This was also true for the ERA-Interim rainfall, which is produced by the ECMWF model that uses the Tiedtke scheme. Only two models, which both used the Kain-Fritsch (KF) scheme, showed results different from the others. CRCM5 produced rainfall frequencies that resembled observed data, whereas the RCA models had a very flat diurnal cycle implying light rainfall all day. The WRF3.1.1 produced results similar to the Tiedtke family models.
Although these results were over west-central and East Africa, it is not unreasonable to propose that a similar phenomenon would be expected over Southern Africa. There are major differences between the Tiedtke and KF schemes, which lie in their closure and trigger functions. The Tiedtke trigger function is based on an environmental temperature threshold exceedance and the closure function is based on moisture convergence. The KF convection trigger is based on large-scale vertical velocity and the closure function on the removal of convective available potential energy (CAPE). These are fundamentally different techniques used for parameterizing convection.
The different results between the three models using the KF scheme may lie in the closure assumption and/or in the PBL scheme. The PBL scheme helps set the stability of the environment and the closure scheme removes CAPE from the air column. Nikulin et al. (2012) suggest two reasons for models using the KF scheme to produce relatively better results: namely, an advanced trigger function and entrainment and detrainment processes that are responsive to environmental conditions through a buoyancy sorting approach (Kain and Fritsch 1990). The WRF3.1.1 simulation, which had a wet bias, used the Yonsei University (YSU) PBL scheme (Hong et al. 2006), which was found by Crétat et al. (2012) to be a scheme suitable for use over this region. This scheme has been found to produce more realistic boundary layer characteristics in terms of vertical mixing and moisture and heat entrainment (Hu et al. 2010). The latest version of the KF scheme was used in which the closure method rearranges mass in the column using the updraft, downdraft, and environmental mass fluxes until at least 90% of the convective available potential energy is removed. In this latest version, the calculation of CAPE is based on the path of an entraining parcel in the column, so the parcel can be diluted as it ascends. The RCA3.5 simulation used an earlier version of the KF scheme where the parcel ascends without entrainment (dilution) so more CAPE would have to be removed in the closure. The CRCM5 simulation (Hernández-Díaz et al. 2013) uses a modified KF scheme called the Bechtold-KF scheme, which adjusts the trigger function and closure function of shallow convection (Bechtold et al. 2001; Yanjun and Jones 2008).
There is much literature that shows that wet rainfall biases introduced by the cumulus parameterization result from an overestimation of low intensity rainfall (e.g., Dai and Trenberth 2004; Wang et al. 2007; Im et al. 2008; Neelin et al. 2008; Gianotti et al. 2012). Gianotti et al. (2012) have suggested that the oversimulation of drizzle is the result of convective rainfall being initiated too frequently in the model, which is a function of the threshold criteria for triggering convection and the creation of environmental conditions to meet these criteria. In the latter, the PBL scheme plays a critical role in setting up the unstable environment to trigger convection. They also suggest that convection triggers have been formulated for coarser space and time resolutions and the uniform threshold criteria that trigger convection at these spatial scales may not be appropriate at higher resolutions. Neelin et al. (2008) suggest that some form of stochasticity be incorporated into the triggering of convective activity as one method of introducing the necessary variability.
The general wet bias over the central and southeastern parts of the domain is not large (with the exception of WRF3.1.1 and perhaps PRECIS), and the ensemble average bias is small, except over the south and southwest parts (South Africa, Namibia, and the south part of Botswana), where the ensemble annual mean overestimates rainfall amounts by about 50% of the observed annual mean (also see Kim et al. 2013). The WRF3.1.1 used the KF cumulus scheme and the YSU PBL scheme, which have been tested over the region with good results by Tadross et al. (2005). Solman and Pessacg (2012) show the KF scheme to be most suitable in subtropical regions, so it is unclear to the authors why WRF3.1.1 should introduce such a wet bias to the region. With the exception of WRF3.1.1, the models produce a dry bias over the northeast of the domain, more specifically over Mozambique and the south part of Malawi (Nikulin et al. 2012; Kim et al. 2013), which is more likely related to circulation anomalies that shift the south Indian convergence zone west (Cook 2000) than directly to the cumulus parameterization.
However, it must be noted that cumulus schemes are not the only source of rainfall bias in the region. Tchotchou and Kamga (2010) caution in their study of the West African monsoon using RegCM3 that unrealistic soil moisture resulting from the way albedo has been parameterized was a leading cause of an excessive northward penetration of monsoon flow. Solman and Pessacg (2012) attributed the underestimation of precipitation over the La Plata Basin by the MM5 largely to a misrepresentation of the moisture flux convergence. Furthermore, the driving boundary conditions can also affect model bias if these constrain the model physics. We tested the potential influence of the ERA-Interim driving data on the bias by correlating each model's output and the ensemble average with the ERA-Interim rainfall for each season. For MAM, JJA, and SON, the ensemble average correlation was high (between 0.6 and 0.9) across most of the domain except over parts of Namibia, where there was very low correlation. On a model by model basis, however, correlations were spatially heterogeneous and ranged between −0.6 and 0.9 so the ensemble result is likely an error cancelation feature. During DJF, correlations were low in both the ensemble average (from −0.4 to 0.6) and the individual models (−0.8 to 0.6). This suggests that in the nonpeak rainfall seasons the boundary conditions could be a nonnegligible driver of the rainfall biases over the region. In the core rainfall season, however, the model physics are likely to be a dominant source of the bias through the convective schemes.
There are thus many potential sources of bias in each of the CORDEX models: the internal solvers of the models themselves, the physics packages like the PBL and cumulus schemes, as well as their subschemes like trigger, closure, and entrainment functions. Additionally, the representation of the land surface processes and circulations will also have an influence on the model rainfall bias. However, the ensemble average smooths out these biases and generally outperforms any particular ensemble member. We suggest here, along with others, that an ensemble of differently set up models with different parameterization schemes usually provides superior performance over the best of each because distinct regions exist where one scheme outperforms the other so that the ensemble becomes complementary in capturing the observed climate (Liang et al. 2007; Liu et al. 2009; Liu et al. 2010).
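The way an ensemble average can smooth out member biases may be illustrated with a toy calculation in Python. The three "models", their names, and all values are hypothetical and were constructed so that the member errors partly cancel; the sketch does not reproduce any CORDEX result.

```python
import math
from statistics import fmean


def rmse(sim, obs):
    """Root-mean-square error between a simulated and an observed field."""
    return math.sqrt(fmean((s - o) ** 2 for s, o in zip(sim, obs)))


def ensemble_mean(members):
    """Cell-by-cell mean over a list of equally weighted member fields."""
    return [fmean(values) for values in zip(*members)]


# Hypothetical seasonal precipitation (mm/day) at four grid cells.
obs = [3.0, 5.0, 2.0, 4.0]
members = {
    "wet_model": [4.0, 6.0, 3.0, 5.0],    # +1 mm/day everywhere
    "dry_model": [2.0, 4.0, 1.0, 3.0],    # -1 mm/day everywhere
    "mixed_model": [4.0, 4.0, 3.0, 3.0],  # alternating +1 and -1 mm/day
}

member_rmse = {name: rmse(field, obs) for name, field in members.items()}
ens_rmse = rmse(ensemble_mean(list(members.values())), obs)
print(member_rmse, ens_rmse)
```

Each hypothetical member has an RMSE of 1 mm/day, while the ensemble mean has an RMSE of about 0.33 mm/day. The RMSE of an equally weighted ensemble mean cannot exceed the average RMSE of its members, but it is not guaranteed to beat the best individual member; that only happens when the member errors are sufficiently uncorrelated, as they are in this constructed case.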

Summary and conclusions
In this study, the CORDEX RCMs were analyzed for their ability to capture and characterize rainfall patterns over the Southern Africa region during the period 1990-2006. The ensemble consists of 10 different RCMs all run at ~50-km resolution, on a common grid and driven by ERA-Interim for the period 1989-2008. Data were provided on a monthly time scale. Performance of the individual models and the ensemble average is evaluated seasonally in terms of correlation, variability, and root-mean-square error in three subregions that spanned tropical, subtropical, and midlatitude rainfall regimes. Model precipitation biases were examined spatially, and an attempt was made to relate these biases to circulation biases. We also investigate the rainfall response to the ENSO signal.
We used the GPCC data as the reference dataset for an assessment using Taylor diagrams in three area-averaged regions defined by rainfall regime. This showed that in all three regions the ensemble average outperformed individual model results in all seasons for correlation, standard deviation, and RMSE. In subregion 1, which primarily experiences tropical and subtropical rainfall, during the wet season (DJF) the models are able to capture the phase of interannual variability well and the ensemble mean is generally more closely correlated with the GPCC than the other observational datasets. However, most models tend to systematically underestimate the magnitude of the interannual variability relative to GPCC. In subregion 2, a primarily subtropical rainfall regime, the spread between the observational datasets is smaller in all seasons. As seen in subregion 1, there is also a systematic underestimation in the magnitude of the interannual variability relative to GPCC by most models in the wet season; however, here the ensemble mean has a lower correlation with the GPCC. In subregion 3, which experiences winter rainfall, models again underestimate the magnitude of the interannual variability, a characteristic shared with the other observational datasets, but there is good correlation between the ensemble mean and the GPCC. In terms of individual models in this analysis, ARPEGE5.1 generally performed better than other models and in many cases was comparable to the ensemble mean; however, the ensemble mean generally outperformed the results from individual models. It is also noted that the ensemble mean improves on the ERA-Interim in the north of the domain in removing the wet bias over Angola and lowers the magnitude of the dry bias over northern Mozambique. However, there is no improvement over the ERA-Interim rainfall over Lesotho and the surrounding highlands.
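The three statistics summarized on a Taylor diagram can be computed as in the following minimal Python sketch. The function name and the sample series are hypothetical, and it is an assumption here that the RMSE is the centered (mean-removed) RMS difference, which is the usual Taylor diagram convention. The example is built so that the model matches the phase of the reference exactly but has half its variability, mirroring the behavior described above.

```python
import math
from statistics import fmean, pstdev


def taylor_stats(model, ref):
    """Return (correlation, model std, reference std, centered RMS difference)."""
    mean_m, mean_r = fmean(model), fmean(ref)
    sd_m, sd_r = pstdev(model), pstdev(ref)
    cov = fmean((a - mean_m) * (b - mean_r) for a, b in zip(model, ref))
    corr = cov / (sd_m * sd_r)
    crmse = math.sqrt(
        fmean(((a - mean_m) - (b - mean_r)) ** 2 for a, b in zip(model, ref))
    )
    return corr, sd_m, sd_r, crmse


# Hypothetical interannual series of seasonal mean precipitation (mm/day).
ref = [1.0, 3.0, 2.0, 5.0, 4.0]
# Same phase as the reference but half the amplitude (model = 0.5 * ref + 1).
model = [1.5, 2.5, 2.0, 3.5, 3.0]

corr, sd_m, sd_r, crmse = taylor_stats(model, ref)
# Geometric relation underlying the diagram: E'^2 = sd_m^2 + sd_r^2 - 2 sd_m sd_r R
identity_gap = crmse ** 2 - (sd_m ** 2 + sd_r ** 2 - 2.0 * sd_m * sd_r * corr)
print(corr, sd_m / sd_r, crmse, identity_gap)
```

The correlation is 1 while the standard deviation ratio is 0.5, so a high correlation alone does not imply that the magnitude of the variability is captured. The last value printed confirms, up to rounding, the law-of-cosines relation that allows the three statistics to be shown on a single diagram.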
Model biases were assessed across the Southern African domain; during the wet season (DJF), most models undersimulated precipitation in the northeastern region of the domain. Higher precipitation biases are found in the southeastern parts of the region, especially over regions of high topography. In the winter rainfall region, most models underestimated precipitation to a greater or lesser extent, and all timed the maximum rainfall month too early, likely because of a misrepresentation of the descending limb of the Hadley cell during July. Although different models exhibited different biases, the multimodel ensemble mean again outperformed individual models.
Model biases were related to large-scale circulation anomalies, and where positive rainfall anomalies were noted, corresponding low pressure biases and moisture transport into the region could be inferred. Over regions of high topography (e.g., Lesotho, Drakensberg) most models had a wet rainfall bias during DJF, which is likely a result of a combination of orographic uplift and moisture flow into the region. Dry biases over the northern parts of the domain were evident in 7 of the 10 models, and these biases may be related to a westward shift of the ITCZ through enhanced trade winds and surface pressure anomalies. It is also likely that in regions around Lake Malawi the moisture cycle is not properly resolved. Anyah and Semazzi (2007) showed that the coupling of a three-dimensional lake model with the regional model greatly improved rainfall simulations over the Lake Victoria region.
The capability of CORDEX RCMs to capture the seasonal variability in precipitation related to ENSO events was also assessed. The models are found to simulate the drier-than-average conditions observed during El Niño events over most of Southern Africa reasonably well, though the magnitude and the extent of the wetter conditions observed in the northern part of Mozambique are somewhat reduced. The positive precipitation anomaly in the central and eastern part of the domain observed during La Niña events is also captured by most models, though in some areas the full magnitude of the wetter conditions is not simulated. Over South Africa the spatial extent of the wetter conditions observed during La Niña years is exaggerated in almost all models.
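A composite anomaly of this kind can be sketched as the mean rainfall over selected event years minus the mean over all years. The function name `composite_anomaly`, the rainfall values, and the year lists below are hypothetical; they are not the ENSO classification or the data used in this study, and the compositing procedure itself is an assumption made for illustration.

```python
def composite_anomaly(rain_by_year, event_years):
    """Mean rainfall over event years minus the mean over all available years."""
    climatology = sum(rain_by_year.values()) / len(rain_by_year)
    selected = [rain_by_year[y] for y in event_years if y in rain_by_year]
    if not selected:
        raise ValueError("no event years found in the record")
    return sum(selected) / len(selected) - climatology

# Hypothetical DJF rainfall (mm/day) at one grid point, keyed by year
rain = {1990: 5.0, 1991: 3.0, 1992: 6.0, 1993: 4.0, 1994: 7.0, 1995: 5.0}

# Illustrative year lists only; not an actual ENSO classification
warm_event_years = [1991, 1993]
cold_event_years = [1992, 1994]

warm_anomaly = composite_anomaly(rain, warm_event_years)   # negative: drier than average
cold_anomaly = composite_anomaly(rain, cold_event_years)   # positive: wetter than average
```

Applying the same function to observed and simulated rainfall at each grid point would allow the sign, magnitude, and spatial extent of the two composites to be compared.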
On average, all RCMs capture the main features of the seasonal mean rainfall distribution and the seasonal and annual cycle. However, large biases can be found in individual models depending on the region and the season, as has also been found in earlier multimodel studies (Druyan et al. 2010; Paeth et al. 2011). Nikulin et al. (2012) point out that for many of the downscaling groups this was their first time downscaling over Africa, where land-convection interactions play a leading role in defining details of the precipitation climate. The multimodel average generally outperforms any of the individual models, with biases of similar magnitude to the differences between the observational datasets, and in some regions its bias is lower than the reanalysis bias. This is largely a result of cancellation of opposite-signed biases across the models and highlights the importance of working in the context of multimodel ensembles when simulating and assessing Southern African climate characteristics. The biases could be related to some degree to large-scale circulation features, but this was not properly quantified because of the unavailability of relevant data. We also relate the largest potential source of bias to the internal physics of the models, including cumulus, PBL, and land-use representations.
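The cancellation argument can be made concrete with a small sketch. The model names and rainfall values are hypothetical and do not correspond to any model in the ensemble.

```python
# Hypothetical seasonal-mean rainfall (mm/day) for one subregion
obs_mean = 5.0
model_means = {
    "model_a": 6.2,   # wet bias
    "model_b": 4.2,   # dry bias
    "model_c": 4.4,   # dry bias
    "model_d": 5.6,   # wet bias
}

# Bias of each model relative to the reference value
individual_biases = {name: value - obs_mean for name, value in model_means.items()}

# Bias of the ensemble mean equals the mean of the signed individual biases
ensemble_mean = sum(model_means.values()) / len(model_means)
ensemble_bias = ensemble_mean - obs_mean

# Mean absolute bias: the typical size of an individual model's error
mean_abs_bias = sum(abs(b) for b in individual_biases.values()) / len(individual_biases)
```

Here the ensemble bias is about 0.1 mm/day while the individual biases average 0.8 mm/day in magnitude. The improvement comes from errors of opposite sign offsetting each other, so it does not by itself indicate that any single model is accurate.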
We have presented the results of the CORDEX simulations for Southern Africa and posited some potential sources of the rainfall biases seen. As downscaling groups begin to publish details on the model runs, as, for example, Hernández-Díaz et al. (2013) have recently done in their assessment of the CRCM5, we hope that the reasons for these biases will be more clearly elucidated. It is crucial to understand these as much as possible, as the group of RCMs employed in this study and others will soon perform transient climate change downscalings within CORDEX, forced by coupled GCM results from CMIP5. In the near future these data will be made available through a data repository and will be in a common format (see Giorgi et al. 2009; Jones et al. 2011). We conclude that the regional models evaluated here can be used for the assessment of future climate projections and also suggest that, through the ensemble mean, they are likely to improve on raw GCM diagnostic output such as rainfall as well as their prognostic drivers.
from the South African Weather Service (SAWS) and the Instituto Nacional de Meteorologia (INAM). Mean monthly precipitation data, derived from daily records for the period 1990–2006 from 428 stations in South Africa and Mozambique, are compared against two gauge-based gridded observational datasets, GPCC and UDEL, and the GPCP satellite-gauge dataset. Station monthly mean data were computed only for months without missing daily values, which yielded generally high mean data availability ranging from 76% to 100%, with an average of 98%.
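The rule of computing a station monthly mean only when no daily value is missing can be sketched as follows. The function names, the use of `None` for a missing day, and the interpretation of data availability as the fraction of months retained are assumptions made for illustration; months are shortened to four days to keep the example small.

```python
def monthly_mean_if_complete(daily_values):
    """Return the mean of the daily values, or None if any day is missing."""
    if not daily_values or any(v is None for v in daily_values):
        return None
    return sum(daily_values) / len(daily_values)

def availability(monthly_means):
    """Fraction of months for which a monthly mean could be computed."""
    valid = sum(1 for m in monthly_means if m is not None)
    return valid / len(monthly_means)

# Hypothetical daily rainfall (mm) for four shortened "months" at one station
months = [
    [0.0, 2.0, 4.0, 6.0],
    [1.0, None, 3.0, 5.0],   # one missing day, so this month is discarded
    [2.0, 2.0, 2.0, 2.0],
    [0.0, 0.0, 8.0, 0.0],
]

means = [monthly_mean_if_complete(m) for m in months]
coverage = availability(means)
```

Discarding incomplete months avoids biasing the monthly mean toward the days that happen to be recorded, at the cost of a shorter usable record.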
The closest grid point used in the comparison is determined by the absolute minimum distance to the station. There is good agreement for all gridded data, with mean correlations of 0.89 for GPCC, 0.82 for GPCP, and 0.78 for UDEL (Fig. A1). The mean RMSE (mm day⁻¹) is found to be lowest on average for UDEL (0.29) and slightly higher for GPCC (0.49) and GPCP (0.67).
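The nearest-grid-point selection can be sketched as a minimum-distance search. Great-circle (haversine) distance is assumed here because the distance metric is not specified above; the station coordinates and the 0.5° grid cell centers are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest_grid_point(station, grid_points):
    """Return the (lat, lon) grid point with the smallest distance to the station."""
    return min(
        grid_points,
        key=lambda p: haversine_km(station[0], station[1], p[0], p[1]),
    )

# Hypothetical station and surrounding grid cell centers (lat, lon in degrees)
station = (-25.97, 32.57)
grid = [(-26.25, 32.25), (-26.25, 32.75), (-25.75, 32.25), (-25.75, 32.75)]
nearest = nearest_grid_point(station, grid)
```

The monthly series at the selected grid point can then be compared with the station series using correlation and RMSE.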
The GPCC and UDEL datasets partly share the same gauge stations, and differences can arise from the different processing algorithms and the levels of data availability for given time periods. We note that the higher RMSE associated with lower correlation for some stations on the north side of the Bay of Maputo in GPCC may be a result of occasional mismatching of extremely rainy months. As an example, November 2000 shows the largest relative difference between GPCC and the stations among all November months for 1990–2006, with a strong underestimation of rainfall by GPCC over this sector. UDEL also strongly underestimates rainfall for this month, while GPCP captures the peak.
Compared to other regions of the world, there are very few stations in certain regions of Southern Africa, which makes it difficult to quantify the uncertainties between the different observational datasets. In areas where fewer stations are available for the production of the GPCC product, the uncertainties would be larger with respect to other reference datasets. For the density of in situ rainfall records used by GPCC, please see Fig. 3 in Nikulin et al. (2012).
This example demonstrates that the gridded products generally represent rainfall over South Africa and Mozambique well, as they partly share the same gauge stations. However, local differences do occur in areas where there are fewer stations and these differences can be important.