Surface air temperature, precipitation, and insolation over the conterminous United States region from the North American Regional Climate Change Assessment Program (NARCCAP) regional climate model (RCM) hindcast study are evaluated using the Jet Propulsion Laboratory (JPL) Regional Climate Model Evaluation System (RCMES). All RCMs reasonably simulate the observed climatology of these variables. RCM skill varies more widely for the magnitude of spatial variability than the pattern. The multimodel ensemble is among the best performers for all these variables. Systematic biases occur across these RCMs for the annual means, with warm biases over the Great Plains (GP) and cold biases in the Atlantic and the Gulf of Mexico (GM) coastal regions. Wet biases in the Pacific Northwest and dry biases in the GM/southern Great Plains also occur in most RCMs. All RCMs suffer problems in simulating summer rainfall in the Arizona–New Mexico region. RCMs generally overestimate surface insolation, especially in the eastern United States. Negative correlation between the biases in insolation and precipitation suggests that these two fields are related, likely via clouds. Systematic variations in biases for regions, seasons, variables, and metrics suggest that the bias correction in applying climate model data to assess the climate impact on various sectors must be performed accordingly. Precipitation evaluation with multiple observations reveals that observational data can be an important source of uncertainties in model evaluation; thus, cross examination of observational data is important for model evaluation.
With the confirmation of the occurrence of climate change induced by anthropogenic greenhouse gases and associated feedback within Earth's climate system, developing policies to ensure sustainable development in the future has become a worldwide concern. Accordingly, assessing the impacts of the anticipated climate variations and change on regionally important sectors is growing in importance, as it provides scientific resources to policy makers in decision making for mitigating and adapting to these impacts (Houghton et al. 2001). Typically, assessing the impact of climate change on specific sectors is achieved by driving sector-specific assessment models (e.g., surface hydrology, agriculture, various ecosystems) using forcing data from climate model outputs representing the present and future climates as illustrated schematically in Fig. 1.
It is important to recognize that the only quantifiable and objective information on future climate stems from projections by physically based, multicomponent numerical climate models, now often referred to as Earth system models (ESMs) (e.g., Abiodun et al. 2008; Pollard and Thompson 1992). These ESMs calculate the physical and dynamical processes and interactions within and between various components of Earth's climate system including the atmosphere, hydrosphere, lithosphere, cryosphere, and biosphere. However, all climate models, as well as simulations in which they are used, suffer from incompleteness in their formulations due to the lack of 1) understanding of the underlying processes, 2) observations to correctly prescribe initial states, and 3) computational capacity combined with challenges in robustly yet economically coding the aspects of the processes and models that we do understand; this incompleteness results in model errors. Because of the critical role such models play in projecting future climate and assessing its impacts (Solomon et al. 2007), it is a high priority to bring as much observational scrutiny to these models as possible (e.g., Nature 2010). Figure 2 provides a schematic illustration of the flow of information, hierarchically from left to right, showing (a) detection and characterization of the past climate variations and change, (b) global model development and evaluation, (c) the projection of future climate, (d) downscaling and/or the assessment of the impact of climate variations and change on specific sectors, and (e) decision making to mitigate and/or adapt to the impacts. In particular, climate model evaluation, both for global climate models (GCMs; Fig. 2a) and regional climate models (RCMs; Fig. 2d), is among the especially important steps in the process. For RCMs in particular, model evaluation is used not only for model development and improvements but also for assessing and correcting model biases. 
Model evaluations are also used to weight individual models in multimodel ensembles, alleviate the effect of model error on assessment models, and estimate the range of uncertainty in projected impacts.
The long history of GCM analyses for assessments and other climate variability issues has resulted in a mature process of model experimentation and evaluation (e.g., Houghton et al. 1996, 2001; Solomon et al. 2007; Bader et al. 2008). Gleckler et al. (2008b) evaluated the fidelity with which multiple GCMs simulate the present-day climate, using the models that contributed to the archives of phase 3 of the Coupled Model Intercomparison Project (CMIP3) used for the Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report (AR4) (Meehl et al. 2007). This study is among the first (cf. Murphy et al. 2004; Reichler and Kim 2008) that attempted to systematically and quantitatively score GCM-simulated global climate fidelity across the CMIP archive for multiple GCMs, model parameters, and metrics. These evaluations, and their refinements, are crucial for quantitative weighting of projections of future climate, with weightings that may depend on the key processes relevant to the given decision-support goal (e.g., Gleckler et al. 2008b; Dosio and Paruolo 2011; Gangopadhyay and Pruitt 2011; Grigory et al. 2012; Giorgi et al. 2009; Mearns et al. 2012b; Kim et al. 2013b). It must be noted that all weighting schemes, regardless of their principles, are highly subjective and remain a subject of further research.
Systematic multimodel RCM experimentations and observation-based evaluations are much less mature than those for GCM studies. Given RCMs' important role in regional climate change impact assessments that are frequently the scientific basis for developing policies to mitigate and adapt to these impacts, it is critical that a framework be established to bring to bear as much observational scrutiny as possible to this process. Note that even with the expected increases in computer resources that will allow GCMs to run at higher horizontal resolutions, RCMs will remain essential to the processes for regional climate projections, climate change impact assessments, and policy making for the foreseeable future.
This study evaluates three RCM variables—precipitation, surface air temperature, and surface insolation—that are important for surface hydroclimate and widely used in impact assessment studies using the Regional Climate Model Evaluation System (RCMES; Crichton et al. 2012; Hart et al. 2011; Kim et al. 2013b). A brief description of the current and planned capabilities of RCMES is presented in section 2. Section 3 presents the evaluation of these three fields and is followed by conclusions and discussions in section 4.
2. The Regional Climate Model Evaluation System
Apart from their use in characterizing and understanding climate variability, observed data are fundamental to model evaluation. Moreover, as model resolution has increased and models have become more complex, the demands for observations have increased dramatically. For example, today's fine-resolution RCMs require measurements with horizontal resolutions on O(1–10 km) for evaluation. In addition, most of today's RCMs incorporate and couple a variety of processes within the climate system, and thus require observations of a range of variables. Such evaluations are sorely needed as models, such as GCMs and RCMs, play an ever more important role in guiding our adaptation and mitigation paths associated with climate change. Taken together, the demands from higher resolution and multivariate evaluation make the scientific and logistical process of model evaluation ever more challenging (e.g., Overpeck et al. 2011). These challenges include not only bringing together massive amounts of observational and model data, but also dealing with the wide variety of sources and formats of data, necessitating significant investments in computer and personnel resources to transfer, decode, (re)format, (re)archive, and analyze the data. Such steps can make the process of performing robust model evaluations extremely difficult and time consuming even for highly trained scientists. A number of these steps could be greatly facilitated by a flexible, coupled database and analysis system that alleviates much of the burden of handling these massive datasets and allows scientists to spend more time on performing comprehensive and robust model evaluations. To help address these needs, including developing the technical capabilities and performing robust model evaluations, RCMES has been developed through collaboration between the Jet Propulsion Laboratory (JPL) and the University of California, Los Angeles (UCLA).
RCMES (Fig. 3) is a combined database/toolkit designed to facilitate access to observational data, together with a basic computational toolkit for handling the database and calculating statistical metrics, with special emphasis on handling the variety of remote sensing data that are available today or will become available in the future (e.g., Gleckler et al. 2008a). The system has complete end-to-end capabilities in terms of ingesting observational and model data, performing subsetting and formatting compatibility steps, calculating evaluation metrics, and plotting and/or outputting the results. Its inherent design makes it easily extensible for including more observations, diagnostics, and plotting/outputting capabilities. Details of the computational aspects of RCMES, both software and hardware related, are presented by Hart et al. (2011) and Crichton et al. (2012); only a brief outline of RCMES is presented below.
RCMES is composed of two main components, the Regional Climate Model Evaluation Database (RCMED) and the Regional Climate Model Evaluation Toolkit (RCMET). RCMED can reside on a single server or be distributed on multiple servers to allow efficient data management and sharing while reducing the hardware and software burdens for handling the data storage and traffic. It ingests data in various formats from various sources, extracts the metadata, and translates the main body of each dataset into a common data format to be stored in RCMED. Once stored within RCMED, the data can be readily retrieved by RCMET for processing. The current system uses a sophisticated data management system comprising Apache OODT (Mattmann et al. 2006, 2009), a MySQL database back-end, and the Apache Hadoop technology (White 2009) to store and allow querying of the observational datasets. The data catalog for RCMED is continuously updated and is posted on the RCMES web site (http://rcmes.jpl.nasa.gov).
The RCMET is the main user-end of RCMES; it is composed of a library of Python scripts and is installed on users' local systems. It retrieves the reference data, usually from observations, reanalysis, and/or assimilations, and model data from RCMED and the user's own archive, respectively; regrids both datasets onto a common spatial and temporal grid; calculates metrics for model evaluation; and produces visualizations of the results. The RCMET communicates with RCMED to retrieve the reference data via a web interface. This method of linking RCMED and RCMET enables users to access database items located at more than one data system and institution. That is, the hardware required for hosting RCMED and the data traffic between data servers and users can be distributed to avoid massive infrastructure and to ease future growth. This allows the system to grow through collaborations among institutions with minimal investments in both hardware and manpower. In addition to the access to the database, RCMET includes a software suite for calculating statistical metrics popularly used in model evaluations and visualizations. Model-evaluation metrics and visualization generally vary widely according to users and targets; some users invent their own metrics to fit their specific analysis and scopes. Thus, it is impractical, if not impossible, to include all metrics calculations and visualization within RCMET, even though it is continuously updated to include additional metrics and visualization widely used for model evaluation. For added flexibility, RCMET also includes the capability to incorporate user-defined metrics as well as pathways to extract partially processed data (e.g., both model and reference data regridded onto a common grid) so that users can do their own specific data processing and visualizations.
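The idea of combining built-in and user-defined metrics can be illustrated with a minimal sketch. It assumes that the model and reference data have already been regridded onto a common grid and flattened into equal-length sequences. The registry `METRICS`, the decorator `register_metric`, and the function `evaluate` are hypothetical names introduced here for illustration only; they are not the RCMET interface.

```python
from math import sqrt

# Hypothetical metric registry for illustration; this is NOT the RCMET API.
METRICS = {}

def register_metric(name):
    """Return a decorator that stores a metric function under `name`."""
    def decorator(func):
        METRICS[name] = func
        return func
    return decorator

@register_metric("bias")
def mean_bias(model, reference):
    """Mean difference (model minus reference) over all grid points."""
    return sum(m - r for m, r in zip(model, reference)) / len(model)

@register_metric("rmse")
def rmse(model, reference):
    """Root-mean-square error over all grid points."""
    return sqrt(sum((m - r) ** 2 for m, r in zip(model, reference)) / len(model))

@register_metric("max_abs_error")
def max_abs_error(model, reference):
    """Example of a user-defined metric: the largest absolute error."""
    return max(abs(m - r) for m, r in zip(model, reference))

def evaluate(model, reference, names=None):
    """Apply the selected (default: all) registered metrics to one data pair."""
    selected = sorted(METRICS) if names is None else names
    return {name: METRICS[name](model, reference) for name in selected}
```

In this sketch a user adds a new metric by decorating a two-argument function, and `evaluate(model, reference)` returns a dictionary keyed by metric name.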
3. Evaluation of NARCCAP model simulation fidelity
The surface air temperature, precipitation, and surface insolation over the conterminous U.S. region for the 24-yr period 1980–2003 simulated by multiple RCMs (listed with full expansions in Table 1) that contributed to the NARCCAP hindcast experiment are evaluated. These three variables have been chosen because they represent key atmospheric variables in shaping the surface hydrology and atmosphere–land interaction. NARCCAP is an international program that was designed to serve the high-resolution climate scenario needs of the United States (U.S.), Canada, and northern Mexico, in time-slice experiments using RCMs nested within GCMs to form multimodel ensemble climate scenarios over these regions (Mearns et al. 2009, 2012b). The primary application of these experiments is to investigate uncertainties in regional-scale projections of future climate and generate climate change scenarios for use in impacts research. These experiments represent the main systematic, multi-RCM climate simulation and projection resource for the assessment of the climate change impact over most of North America.
NARCCAP is composed of a set of RCM simulations using the large-scale forcing data from an atmospheric reanalysis (climate hindcast) and GCMs (climate scenarios) over a domain covering the North America region including Canada, the conterminous United States, and northern Mexico (http://www.narccap.ucar.edu). In the hindcast experiment, the RCMs are driven with the large-scale forcing data from the National Centers for Environmental Prediction (NCEP) Reanalysis II for the period 1979–2004. This phase-1 experiment is essential for evaluating RCMs over the North America region to examine the fidelity of these RCMs and the characteristics of model biases. All of the participating RCMs are run at spatial resolutions of about 50 km; the spatial resolution varies according to models as well as map projections used in the hindcast. The NARCCAP program archives the model data and provides them to the users for the assessment of the impact of climate change on regionally important sectors (Mearns et al. 2009). Details of the NARCCAP program are provided at the project web site (http://www.narccap.ucar.edu).
The 0.5°-resolution Climatic Research Unit (CRU) surface analysis is used as the reference data against which the simulated surface air temperatures and precipitation are evaluated for the 24-yr period 1980–2003. The simulated surface insolation is evaluated against the Global Energy and Water Cycle Experiment (GEWEX) Surface Radiation Budget (SRB) data (Hinkelman et al. 2009; Stackhouse et al. 2011) for the 20 years from 1984 to 2003. Note that this insolation evaluation period is shorter than that for precipitation and temperature due to limited temporal coverage of the SRB dataset. The regional climate hindcast data from all RCMs are distributed in a common analysis grid at a 0.5° resolution (Fig. 4) by the NARCCAP program. Multiple subregions within the conterminous U.S. region (Fig. 4; Table 1) are selected to facilitate the stratification of model performances according to regions and, to some degree, distinct regional climate characteristics. For example, precipitation in the SWw region (all region abbreviations are expanded in Table 2; see Table 1 and Fig. 4) is directly affected by the North American monsoon system (e.g., Higgins et al. 1997; Kim 2002), and the winter precipitation in the Pacific Northwest (PNw and PNe) and that in southern California show opposite responses to ENSO (Redmond and Koch 1991). Key findings from the evaluation study are summarized in the following sections.
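As an illustration of how a gridded field might be reduced to a single subregion value, the sketch below averages a latitude-longitude field over a rectangular box. Several aspects are assumptions rather than details given in the text: the subregions are treated as simple latitude-longitude rectangles, grid cells are weighted by the cosine of latitude, and missing values (e.g., ocean points in a land-only analysis) are marked with `None`. The function name `subregion_mean` is hypothetical.

```python
from math import cos, radians

def subregion_mean(field, lats, lons, lat_range, lon_range):
    """Area-weighted mean of field[i][j] (located at lats[i], lons[j]) in a box.

    Assumptions: rectangular subregion, cos(latitude) area weights, and
    None marking missing points, which are skipped.
    """
    total = 0.0
    weight_sum = 0.0
    for i, lat in enumerate(lats):
        if not lat_range[0] <= lat <= lat_range[1]:
            continue
        weight = cos(radians(lat))  # grid-cell area shrinks toward the poles
        for j, lon in enumerate(lons):
            if not lon_range[0] <= lon <= lon_range[1]:
                continue
            value = field[i][j]
            if value is None:
                continue
            total += weight * value
            weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no valid grid points inside the subregion")
    return total / weight_sum
```

On a 0.5° grid confined to midlatitudes the cosine weights change slowly, so an unweighted mean would give similar values; the weighting is kept here only for generality.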
a. Surface air temperature
The simulated annual mean surface air temperature climatology is evaluated against the CRU analysis. The most noticeable feature is the general warm bias in the Great Plains region (Fig. 5). This warm bias is found in all RCM results except CRCM. CRCM is an outlier among these six RCMs in the sense that it generates general cold biases over almost the entire conterminous U.S. region. In contrast, HRM3 (Fig. 5d) generates overall warm biases. All RCMs generate warm biases in California's Central Valley and southern California and cold biases in the coastal regions, especially along the Atlantic Ocean and the Gulf of Mexico. These coastal cold biases are absent along the Pacific Ocean. Instead, all models except HRM3 underestimate the surface temperatures over the high elevation regions in the Cascades and the Sierra Nevada. The temperature bias over the Rocky Mountains is complicated, but there is a hint, when the bias pattern is compared with the terrain profile shown in Fig. 4, that cold (warm) biases tend to occur in the regions of relatively high (low) elevations. This may be related to the simulation of cold-season snowpack in the high-elevation regions (e.g., Waliser et al. 2011) and/or to the lack of resolution, in both the model simulations and the CRU data, suitable for representing the large orographic variations and associated variations in surface temperature in the mountainous region (e.g., Giorgi et al. 1997; Kim 2001; Kim and Lee 2003). Higher-resolution simulations may adequately resolve the temperature variations in the region.
Overall, all RCMs simulate the spatial variations in the annual mean temperatures in the conterminous U.S. region with the spatial pattern correlation coefficients between 0.95 and 0.99 and standardized deviations (the spatial standard deviation of the simulated surface air temperature normalized by that of the observed data) of 0.9–1.05 with respect to the CRU data (Fig. 6). Standard deviations and the pattern correlation coefficients in Fig. 6 are calculated over only the land surfaces. Figure 6 also shows that the multimodel ensemble mean (ENS in Fig. 6), along with MM5I, yields the smallest RMSE. Note that the distance between individual data points and the reference point (labeled REF in Fig. 6) in the Taylor diagram (Taylor 2001) represents RMSE.
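The three quantities summarized in a Taylor diagram can be sketched as follows. In the formulation of Taylor (2001), the distance from the reference point is the centered RMS difference E', i.e., the RMSE computed after removing the mean of each field, and it satisfies E'^2 = σ_m^2 + σ_r^2 − 2 σ_m σ_r R, where σ_m and σ_r are the model and reference spatial standard deviations and R is the pattern correlation. The sketch assumes fields flattened to equal-length sequences, a Boolean land mask, and no area weighting; the function name `taylor_stats` is hypothetical and the code is not taken from RCMET.

```python
from math import sqrt

def taylor_stats(model, reference, land_mask):
    """Pattern correlation, standardized deviation, and normalized centered RMSE.

    Only points where land_mask is True are used; all points are weighted
    equally (an assumption of this sketch).
    """
    m = [x for x, keep in zip(model, land_mask) if keep]
    r = [x for x, keep in zip(reference, land_mask) if keep]
    n = len(m)
    m_mean = sum(m) / n
    r_mean = sum(r) / n
    m_anom = [x - m_mean for x in m]
    r_anom = [x - r_mean for x in r]
    sd_m = sqrt(sum(a * a for a in m_anom) / n)
    sd_r = sqrt(sum(a * a for a in r_anom) / n)
    corr = sum(a * b for a, b in zip(m_anom, r_anom)) / (n * sd_m * sd_r)
    # Centered RMS difference: the mean bias does not contribute.
    crmse = sqrt(sum((a - b) ** 2 for a, b in zip(m_anom, r_anom)) / n)
    return {
        "correlation": corr,
        "standardized_deviation": sd_m / sd_r,
        "normalized_centered_rmse": crmse / sd_r,
    }
```

Because the mean bias is removed, two models with identical positions in the diagram can still differ in their domain-mean bias, which is why bias maps are examined separately.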
Comparison of the simulated annual cycle against the CRU analysis for the subregions shows that the multimodel ensemble is generally within one standard deviation (σ) from the observed climatology in these regions (not shown). It also shows that the annual cycles simulated by individual models vary, but generally remain within the ±1σ range. Considering that these regions are spread over the entire conterminous United States, this suggests that the simulated annual cycle surface air temperature climatology in the NARCCAP hindcast experiment exhibits high model fidelity within the conterminous United States.
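The ±1σ comparison of annual cycles can be sketched as below. The sketch assumes that σ is the interannual standard deviation of the observed monthly means, computed separately for each calendar month, and that the population form of the standard deviation is used; neither choice is specified in the text. The function name `months_outside_band` is hypothetical.

```python
from statistics import mean, pstdev

def months_outside_band(obs_by_year, model_climatology, n_sigma=1.0):
    """Return calendar months (1-12) where the model climatology lies outside
    the observed climatology +/- n_sigma interannual standard deviations.

    obs_by_year: list of years, each a list of 12 monthly means.
    model_climatology: list of 12 long-term monthly means from a model
    or from the multimodel ensemble.
    """
    outside = []
    for month in range(12):
        values = [year[month] for year in obs_by_year]
        climatology = mean(values)
        sigma = pstdev(values)  # population standard deviation (assumption)
        if abs(model_climatology[month] - climatology) > n_sigma * sigma:
            outside.append(month + 1)
    return outside
```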
Despite the reasonable performances, model biases vary noticeably according to regions and seasons. Figure 7 presents the biases and interannual variability of the simulated surface air temperatures in the 14 subregions during winter and summer, normalized by (i.e., expressed as percentages of) the temporal standard deviation of the CRU data over the 24-yr period. The temporal standard deviations are adopted as the measure of the interannual variability. The scaled model bias (Fig. 7a) shows that the warm bias over the Great Plains region is common for nearly all models in both winter and summer; CRCM, which generates weak (by 25%–50% of the observed interannual variability) cold biases over the region in winter, is the only exception. Thus, the warm bias in the Great Plains region is systematic regardless of models and seasons. The model biases also vary systematically according to regions. For summer, the most noticeable systematic biases are the warm bias in the western inland region including SWw, SWe, and Colorado (COL), and the cold bias in the PNw and PNe. Note that these biases are systematic only during summer. For winter, the most systematic biases are the warm bias in the southern California and Atlantic coast regions, especially the southeastern United States and Florida, and the cold bias in the northern California and Arizona–western New Mexico regions. The temperature bias in California varies closely with orography as shown in Fig. 5; this may not be well represented in this regional evaluation as the area representing the two California regions is large enough to include both low- and high-elevation regions. The evaluation of the temporal standard deviation, a surrogate for the interannual variability (Fig. 7b), shows that all models perform well in simulating the interannual variability of the winter temperatures for all subregions. 
Most RCMs also reasonably simulate the interannual variability of the summer temperatures in most of the coastal regions, but overestimate it in the interior regions.
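The scaling used for the bias and the interannual variability can be written as a short sketch: both are expressed as percentages of the temporal standard deviation of the reference series. The sketch assumes one seasonal-mean value per year for a given subregion and the population form of the standard deviation; the function name `scaled_bias_and_variability` is hypothetical.

```python
from statistics import mean, pstdev

def scaled_bias_and_variability(model_series, obs_series):
    """Bias and interannual variability as percentages of the observed
    interannual standard deviation.

    Each series holds one seasonal mean per year for a single subregion.
    """
    obs_sd = pstdev(obs_series)  # measure of observed interannual variability
    bias_pct = 100.0 * (mean(model_series) - mean(obs_series)) / obs_sd
    variability_pct = 100.0 * pstdev(model_series) / obs_sd
    return bias_pct, variability_pct
```

With this scaling, a bias of 100% means that the model climatology is offset from the observed one by a full observed interannual standard deviation, and a variability of 100% means that the simulated and observed interannual variability have the same magnitude.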
The RCM skill in simulating the interannual variability of the seasonal temperature is further examined using RMSE and the temporal correlation coefficients between the simulated and CRU data over the 24-yr period. The resulting RMSE (Fig. 7c) generally exceeds the interannual variability of the CRU data (i.e., normalized RMSE >100% in Fig. 7c), especially during summer. For winter, the RMSE varies according to models in most regions; the normalized RMSE for CRCM and ECP2, as well as the multimodel ensembles, is less than 100% while that for MM5I, HRM3, RCM3, and WRF is well above 140% for all subregions. The correlation coefficients between the simulated and CRU time series (Fig. 7d) also show that the RCMs examined in this study generally perform better in simulating the phase of the interannual variation in the surface air temperatures during winter than in summer.
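The two time series metrics used here can be sketched as follows. The sketch assumes that the RMSE is computed from the raw yearly differences, so that a mean bias contributes to it, and that the normalization uses the population standard deviation of the reference series; the function name `normalized_rmse_and_correlation` is hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

def normalized_rmse_and_correlation(model_series, obs_series):
    """RMSE as a percentage of the observed interannual standard deviation,
    and the temporal (Pearson) correlation between the two series."""
    n = len(obs_series)
    rmse = sqrt(sum((m - o) ** 2 for m, o in zip(model_series, obs_series)) / n)
    m_mean = mean(model_series)
    o_mean = mean(obs_series)
    covariance = sum((m - m_mean) * (o - o_mean)
                     for m, o in zip(model_series, obs_series)) / n
    correlation = covariance / (pstdev(model_series) * pstdev(obs_series))
    return 100.0 * rmse / pstdev(obs_series), correlation
```

Under these assumptions a model with a perfect phase (correlation of 1) can still have a normalized RMSE above 100% if its mean bias exceeds the observed interannual variability, which is one way the two metrics can rank models differently.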
In summary, the model performance measured in terms of bias is not well correlated with that measured in terms of interannual variability. Model performance also varies widely, and often systematically, according to regions and seasons. These characteristics of model errors (or performance) make it difficult to design a set of model weightings that can be universally applied to the construction of a multimodel ensemble. Thus, it may be necessary to use season- and/or property-specific weighting factors in constructing a multimodel ensemble for impact assessments, at the expense of generality. Mearns et al. (2012a) discuss these biases on a seasonal time scale.
b. Precipitation
The bias in the simulated annual precipitation calculated against the CRU data (Fig. 8) shows that all five RCMs evaluated in this study systematically overestimate the annual-mean precipitation in the northwestern U.S. and the southwestern Texas regions, and underestimate it in the GC and GPs regions. Most RCMs also overestimate the annual precipitation in the Rocky Mountains region, but not as systematically and/or sizably as the wet bias in eastern Washington and northern Idaho, to the east of the Cascades. It is also noted that all RCMs except CRCM underestimate precipitation in southern and central Arizona.
Evaluation of the spatial variability of the simulated annual precipitation (Fig. 9) shows that all models generate similar spatial correlations with the CRU data, with spatial correlation coefficients between 0.8 and 0.9; the highest spatial correlation is generated by the model ensemble (ENS in Fig. 9). The standardized deviation of the simulated annual precipitation, which measures the magnitude of the spatial variability relative to the CRU data, ranges between 0.75 and 1.1, with a majority of RCMs underestimating the spatial variability. As for the surface air temperatures, the multimodel ensemble yields the smallest RMSE. Note that in the Taylor diagram, the distance between REF and a model corresponds to RMSE (Gleckler et al. 2008b).
The simulated annual cycles of precipitation in the 14 subregions are compared with those from the CRU analysis (Fig. 10). Unlike the surface air temperature evaluated in the previous section, the intermodel variations are large. The multimodel ensemble (blue lines in Fig. 10) is generally within or close to the ±1σ range throughout the entire year in most subregions except for the inland part of the Pacific Northwest region (PNe), Colorado, and northern Great Plains. The multimodel ensemble overestimates the observed precipitation throughout the year in the Pacific coastal regions (PNw, CAn), except CAs where ENS and CRU are close, and in the interior Pacific Northwest region. The underestimation of precipitation occurs throughout the year in GC; however, the bias is very small for the period from late spring (April) to late summer (August). In the interior regions (SWw, SWe, COL, GPn, GPc, GL), the multimodel ensemble overestimates (underestimates) precipitation roughly in the first (second) half of the year. For the Atlantic coast regions (NE, SE), the multimodel ensemble overestimates precipitation except in the period from late fall to early winter. Thus, as in the surface air temperature simulation, the model performance in simulating the precipitation annual cycle varies according to regions and seasons in addition to RCMs.
The multimodel ensemble precipitation annual cycle (Fig. 11) is evaluated using the RMSE normalized by the annual-mean value and the temporal correlation between the model ensemble and CRU data. The normalized RMSE (Fig. 11a) is relatively small in the eastern half of the conterminous U.S. and the Pacific coast regions and large in the interior western U.S. regions, especially for PNe where the RMSE is comparable to the annual means for all models. All or most models reasonably simulate the phase of the precipitation annual cycle in all regions except the Great Plains and GL regions. Figures 10 and 11 show that the multimodel ensemble precipitation annual cycle is generally more reliable in the coastal regions than in the interior regions.
The simulated seasonal precipitation in the subregions is evaluated against the CRU data using the normalized bias and interannual variability as the performance metrics. As with the seasonal surface air temperature evaluation in section 3a above, the model bias and interannual variability are normalized by the interannual variability of the CRU data. Figure 12 shows that the model biases vary substantially according to season and region. During summer, a majority of RCMs underestimate precipitation in the southwestern United States (CAs, SWw) and the two Great Plains regions, and overestimate it in the Pacific coast (PNw, PNe), CL, and Atlantic coast regions. The bias in the SWw region implies that all models have difficulties in simulating the summer precipitation associated with the North American monsoon circulation, a great concern because the summer monsoon precipitation plays a crucial role in water resources and ecosystems for this arid region (e.g., Higgins et al. 1999; Kim 2002; Kim et al. 2005). M. Bukovsky et al. (2012, unpublished manuscript) examined the models' performance in the North American monsoon region from a process level point of view. Despite its large magnitude, the overestimation of summer rainfall in the Pacific coast regions may not be a serious concern. Summer rainfall and its interannual variability in these regions are very small; thus, the relatively large wet bias does not translate into meaningful differences for most practical applications. For the GC and GL regions, model biases vary widely about the observed means among models, resulting in small biases in the multimodel ensemble. Note that CRCM generates a large wet bias in the entire mountainous western United States in summer, but yields a relatively small bias in the eastern United States. 
In fact, the small bias in the multimodel ensemble for the interior of the western United States results from the dry biases in five of the six RCMs being compensated by the large wet bias in CRCM. The systematic biases in simulating the interannual variability of the summer rainfall are characterized by underestimations of the variability in both California regions and overestimations in the Pacific Northwest (PNw, PNe), Colorado, and Great Lakes regions (Fig. 12b). The multimodel ensemble agrees reasonably with the CRU data, with the bias less than 50% of the interannual variability of the CRU data, in 5 out of 14 regions.
The bias in the winter precipitation varies more systematically according to regions (Fig. 12a). A majority of RCMs show large wet biases accompanied by substantial overestimation of the interannual variability in the interior western U.S. (SWw, SWe, COL, GPn) and the northern Pacific coast (PNw, PNe, CAn) regions. All RCMs also underestimate the winter precipitation and its interannual variability in GC. Most RCMs perform well in simulating the winter precipitation in CAs, SE, and FL. Examinations of the interannual variability of the seasonal precipitation using RMSE and temporal correlation coefficients, as in the previous section for temperatures, also show that all RCMs generally perform better in simulating precipitation during winter than during summer (not shown).
The accuracy of reference data can affect the outcome of model evaluations as all observations and/or analyses include errors of unknown and/or estimated magnitudes (Kim and Lee 2003; Kim et al. 2013b; Waliser et al. 1999). Uncertainties in model evaluation originating from reference data are examined for precipitation using five different reference datasets: the CRU, University of Delaware (UDEL; Willmott and Matsuura 1995; Matsuura and Willmott 2009), NCEP Climate Prediction Center (CPC) daily analysis (Higgins et al. 2000), Global Precipitation Climatology Project (GPCP; Adler et al. 2003), and Tropical Rainfall Measuring Mission (TRMM) data (Shige et al. 2006) for 1998–2003. The evaluation period for this multireference data evaluation is determined by the period for which all five reference datasets exist. In this evaluation, we first use the simple mean of all five observational datasets as the reference data and then evaluate all observations and model outputs against the reference data. The evaluation of the overland annual-mean precipitation is summarized in a Taylor diagram (Fig. 13). The area defined by the red lines encompasses all five observational datasets and may be regarded as the range of uncertainties for the observational data. Figure 13 shows that there exist notable variations among these observational datasets in representing the spatial variability and that none of the RCMs shows fidelity within the range of uncertainty defined by these observational datasets. In this multiobservation evaluation, the multimodel ensemble (ENS) also yields better performance than any individual participating RCM.
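The multireference procedure described above can be sketched as follows: the reference is the simple gridpoint mean of all observational datasets, and every observational dataset and model is then scored against it. In this sketch the observational uncertainty range is approximated by the spread of the observational pattern correlations and standardized deviations, which is an assumption and need not coincide with the area drawn in Fig. 13. All datasets are assumed to be on a common grid and flattened to equal-length sequences; the function names are hypothetical.

```python
from math import sqrt

def pattern_stats(field, reference):
    """Pattern correlation and standardized deviation against the reference."""
    n = len(field)
    f_mean = sum(field) / n
    r_mean = sum(reference) / n
    f_anom = [x - f_mean for x in field]
    r_anom = [x - r_mean for x in reference]
    sd_f = sqrt(sum(a * a for a in f_anom) / n)
    sd_r = sqrt(sum(a * a for a in r_anom) / n)
    corr = sum(a * b for a, b in zip(f_anom, r_anom)) / (n * sd_f * sd_r)
    return corr, sd_f / sd_r

def evaluate_against_obs_mean(observations, models):
    """Score observations and models against the mean of the observations.

    observations, models: dicts mapping a dataset name to a flattened field.
    """
    obs_fields = list(observations.values())
    n_points = len(obs_fields[0])
    # Reference = simple mean of all observational datasets at each point.
    reference = [sum(f[i] for f in obs_fields) / len(obs_fields)
                 for i in range(n_points)]
    obs_stats = {k: pattern_stats(v, reference) for k, v in observations.items()}
    corr_min = min(c for c, _ in obs_stats.values())
    sd_low = min(s for _, s in obs_stats.values())
    sd_high = max(s for _, s in obs_stats.values())
    results = {}
    for name, field in models.items():
        corr, sd = pattern_stats(field, reference)
        results[name] = {
            "correlation": corr,
            "standardized_deviation": sd,
            # Simplified stand-in for "inside the observational range".
            "within_obs_range": corr >= corr_min and sd_low <= sd <= sd_high,
        }
    return reference, obs_stats, results
```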
It is interesting to note that, although the relationship between the biases in the seasonal precipitation and surface air temperatures appears to be weak, the summertime dry bias in SWw, GPn, and GPc corresponds to the warm bias in the same regions. This may indicate that the summer surface temperatures in these regions are closely related to the surface latent heat flux, which is strongly affected by precipitation. This will be examined in future studies in which surface fluxes and soil moisture will be included in RCMES and the evaluation. Mearns et al. (2012b) provide more discussion on the seasonal precipitation bias.
c. Surface insolation
Surface insolation is the primary energy input that drives the land–atmosphere interaction in the regional climate system. Hence, the accuracy of the surface insolation is a basis for accurately simulating surface fluxes and hydrology. Evaluation of surface radiation suffered from a lack of suitable observational data until the late 1990s (e.g., Kim and Lee 2003). Recent progress in remote sensing products has made a suite of quality-controlled radiation data available, significantly alleviating the problem. The simulated surface insolation data are evaluated against the SRB data (Hinkelman et al. 2009; Stackhouse et al. 2011) for the 20 years from 1984 to 2003. The evaluation period is dictated by the period for which the SRB data exist. Five RCMs (CRCM, ECP2, HRM3, RCM3, and WRF) and their simple ensemble (ENS) are evaluated.
Figure 14 shows that the model bias in simulating surface insolation varies substantially among RCMs. All but RCM3 show positive bias over land surfaces (Table 3). It is also noteworthy that all RCMs but WRF show more positive (or less negative) biases over the eastern half of the conterminous United States than the western half. The model errors in the annual-mean surface insolation in Fig. 14 are closely related to those of the annual-mean precipitation, with negative spatial pattern correlation coefficients between the model biases in precipitation and insolation for all RCMs and their ensemble (Table 3). This is not surprising, as excessive precipitation is related to excessive cloudiness (or cloud water path) that tends to reduce surface insolation, and vice versa.
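The spatial pattern correlation between two bias fields can be computed as in the sketch below. The helper names (`bias`, `spatial_correlation`) and the four-point toy fields are assumptions made for illustration; they are not values from Table 3, and no area weighting is applied.

```python
import math

def bias(model, obs):
    """Pointwise model-minus-observation difference on a common grid."""
    return [m - o for m, o in zip(model, obs)]

def spatial_correlation(a, b):
    """Pearson correlation between two fields given as flat lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Toy annual-mean fields on four grid points (illustrative values only).
precip_obs = [2.0, 3.0, 1.0, 4.0]
precip_model = [3.0, 3.5, 0.5, 3.0]
insol_obs = [180.0, 170.0, 220.0, 160.0]
insol_model = [170.0, 168.0, 230.0, 175.0]

r = spatial_correlation(bias(precip_model, precip_obs),
                        bias(insol_model, insol_obs))
```

Here the toy model is too wet where it is too dim and too dry where it is too bright, so `r` is strongly negative, the same sign as the relationship reported for the RCMs.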
Model performance in simulating the spatial variability is visualized using the Taylor diagram (Fig. 15). The spatial pattern correlation coefficients between the simulated insolation and the SRB data range from 0.8 to 0.95, similar to those for the annual-mean precipitation (Fig. 9), but somewhat lower than for surface air temperatures (Fig. 6). Figure 15 also shows that most RCMs overestimate the spatial variability compared to the SRB data. As for precipitation and surface air temperature, the multimodel ensemble mean (ENS in Fig. 15) yields a smaller RMSE than any of the RCMs included in the ensemble.
The annual cycle of the simulated surface insolation in the selected regions shows that the multimodel ensemble mean is generally within the ±1σ range from the SRB climatology (not shown). As in the temperature and precipitation evaluations, σ is the temporal standard deviation of the SRB data calculated over the 20-yr period. The intermodel variations are larger than those in the surface air temperature but smaller than those in precipitation (Fig. 10). The most noticeable discrepancy between the multimodel ensemble and the SRB climatology occurs from late spring to early summer (May–July) in the inland regions including the two Great Plains regions and the Great Lakes region, and in the Gulf of Mexico and NE and NS where the model ensemble is close to the upper 1σ boundary of the SRB data in the May–July period.
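The ±1σ comparison of annual cycles can be illustrated with the following sketch. It assumes that σ is computed separately for each calendar month as the population standard deviation across years; the source does not spell out this detail, and the function names and toy numbers are hypothetical.

```python
import math

def monthly_climatology_and_sigma(series):
    """series: list of years, each a list of 12 monthly means.

    Returns the 12-month climatology and, for each calendar month, the
    standard deviation across years (population form, an assumption).
    """
    n_years = len(series)
    clim, sigma = [], []
    for m in range(12):
        vals = [year[m] for year in series]
        mu = sum(vals) / n_years
        var = sum((v - mu) ** 2 for v in vals) / n_years
        clim.append(mu)
        sigma.append(math.sqrt(var))
    return clim, sigma

def months_outside_envelope(model_clim, obs_clim, obs_sigma, k=1.0):
    """Return 1-based months where the model leaves the +/- k*sigma range."""
    return [m + 1 for m in range(12)
            if abs(model_clim[m] - obs_clim[m]) > k * obs_sigma[m]]

# Toy observed insolation: three years offset by -10, 0, and +10 units.
base = [100.0, 130.0, 170.0, 210.0, 250.0, 270.0,
        265.0, 240.0, 200.0, 150.0, 110.0, 90.0]
obs_years = [[v + d for v in base] for d in (-10.0, 0.0, 10.0)]
obs_clim, obs_sigma = monthly_climatology_and_sigma(obs_years)

# Toy ensemble: +12 units in May-July, +5 units in the other months.
model_clim = [v + (12.0 if 4 <= m <= 6 else 5.0) for m, v in enumerate(base)]
outside = months_outside_envelope(model_clim, obs_clim, obs_sigma)
```

With these toy numbers σ is about 8.2 in every month, so only May through July exceed the envelope.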
The simulated surface insolation in summer and winter is evaluated in terms of model biases (Fig. 16a) and interannual variability (Fig. 16b); the model biases and variability are normalized by the interannual variability of the SRB data. Compared to surface temperatures and precipitation, the model biases in surface insolation vary more widely among models. For example, RCM3 underestimates the seasonal-mean surface insolation and overestimates its interannual variability in all regions for both summer and winter, whereas WRFG overestimates the seasonal-mean insolation in most regions, especially in summer when the model overestimates the surface insolation by more than twice its interannual variability in all regions except the Florida region (Fig. 16a). WRFG performs well in simulating the wintertime insolation in PNw, PNe, CAn, and CAs but substantially overestimates it in all other regions except FL. Both CRCM and ECP2 perform well in simulating interannual variability in all regions (Fig. 16b). The model ensemble mean shows positive biases in most regions except PNw, PNe, and CAn during summer, and negative bias in the western United States in winter. The model ensemble also overestimates the interannual variability in all regions in summer, but performs well for winter. Regionally, three out of four RCMs overestimate insolation in the eastern half of the conterminous U.S. region during summer. The only exception is RCM3, which underestimates insolation for all regions in both seasons. In winter, most models underestimate insolation in the Pacific coast and FL.
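The two normalized metrics used in this seasonal evaluation can be written compactly as below. This is a sketch under assumptions: the function name `normalized_seasonal_metrics` is hypothetical, the interannual variability is taken as the population standard deviation of the seasonal means, and the five-year toy series are illustrative rather than SRB or RCM values.

```python
import math

def _mean(xs):
    return sum(xs) / len(xs)

def _std(xs):
    mu = _mean(xs)
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))

def normalized_seasonal_metrics(model_seasonal, obs_seasonal):
    """Return (normalized bias, variability ratio) for one region and season.

    Inputs are lists of seasonal means, one value per year. Both metrics
    are normalized by the interannual standard deviation of the observations.
    """
    obs_sigma = _std(obs_seasonal)
    normalized_bias = (_mean(model_seasonal) - _mean(obs_seasonal)) / obs_sigma
    variability_ratio = _std(model_seasonal) / obs_sigma
    return normalized_bias, variability_ratio

# Toy summer-mean insolation for five years (illustrative values only).
obs_summer = [250.0, 260.0, 240.0, 255.0, 245.0]
model_summer = [270.0, 275.0, 262.0, 268.0, 265.0]

normalized_bias, variability_ratio = normalized_seasonal_metrics(
    model_summer, obs_summer)
```

A normalized bias above 2, as in this toy case, corresponds to a model that overestimates the seasonal mean by more than twice the observed interannual variability; a variability ratio below 1 indicates underestimated interannual variability.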
In some regions, the model bias in surface insolation is negatively correlated with that in precipitation. Such a relationship occurs most clearly in the Pacific coast region in winter, where most RCMs show negative bias in surface insolation (Fig. 16a) and positive bias in precipitation (Fig. 12a). This may be because precipitation in the region is mostly from stratiform clouds that reduce insolation over large areas for extended periods. The negative correlation between the surface insolation bias and precipitation bias also occurs in the Pacific Northwest region in summer; however, the significance of this summertime relationship is not clear as the summer rainfall in the region is very small.
4. Discussions and conclusions
The regional climate model errors in precipitation, surface air temperature, and surface insolation simulated by the RCMs that have contributed to the NARCCAP hindcast experiment have been evaluated against surface- and satellite-based observational data. Because RCMs play a crucial role in generating regional climate data for assessing the impact of climate change on regional sectors by downscaling GCM data, evaluating the performance of RCMs is a critical step in developing objective policy and strategies to mitigate and adapt to the impact of anticipated climate change. To facilitate model evaluations, RCMES has been developed by combining an efficient database schema and a computational toolkit. Evaluations of these RCM results show that all RCMs simulate the spatial pattern and variability of the observed annual-mean climatology of these three fields reasonably well. The multimodel ensemble shows the best performance for all metrics and variables examined in this study, as in Gleckler et al. (2008b). Note that the reason behind such improvement has not been established and that the multimodel ensemble generally underestimates variability compared to the individual models within the ensemble due to smoothing effects (e.g., Kim et al. 2013b). The evaluation also shows that model performance varies systematically not only among RCMs but also with season, region, and metric.
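The smoothing effect of ensemble averaging noted above can be seen in a small numerical sketch. The three "members" below are hypothetical anomaly series with equal variability but different phasing; they are not NARCCAP output.

```python
import math

def std(xs):
    """Population standard deviation of a list of values."""
    mu = sum(xs) / len(xs)
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))

# Toy anomaly series for three hypothetical ensemble members.
members = {
    "m1": [1.0, -1.0, 2.0, -2.0],
    "m2": [-1.0, 2.0, 1.0, -2.0],
    "m3": [2.0, 1.0, -1.0, -2.0],
}

# Simple (unweighted) ensemble mean at each time step.
ensemble = [sum(vals) / len(members) for vals in zip(*members.values())]

member_stds = {name: std(vals) for name, vals in members.items()}
ensemble_std = std(ensemble)
```

Because anomalies that are out of phase among members partly cancel in the average, `ensemble_std` is smaller than the standard deviation of every individual member in this example. In general, the standard deviation of an unweighted mean cannot exceed the average of the member standard deviations.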
It has been found that some errors in simulating these variables occur systematically across all these RCMs. The most noticeable systematic errors in the annual-mean surface air temperatures are the warm biases in the Great Plains and the cold bias in the Atlantic and Gulf of Mexico coasts. Model biases in the mountainous regions may vary according to terrain elevations, suggesting the need for fine resolutions to simulate the temperature variations in mountainous regions. Compared to the CRU data, all models simulate well the spatial pattern and variability of the annual-mean temperature as well as the annual cycles in selected regions. The warm bias over the Great Plains occurs in both summer and winter. For summer, the most notable errors are the warm bias in the interior western United States and the cold bias in the Pacific Northwest. For winter, the most outstanding RCM errors include the warm bias in the Atlantic coast and Florida regions and cold bias in northern California and Arizona–western New Mexico.
The most notable common errors in simulating the annual precipitation are the wet bias in the mountainous northwestern United States and dry bias in GC and the southern Great Plains. Spatial variations in the annual precipitation and the annual cycles in selected regions are also well simulated by all models, although not as well as for surface temperatures. All models generally perform better for the eastern half of the conterminous United States than the western half. The regions of small RMSE do not always correspond to those of large correlation coefficients. In summer, most RCMs underestimate precipitation in CAs, Arizona, New Mexico, the Great Plains, and western Texas, and overestimate it in all three coastal regions. All RCMs show especially poor performance in simulating the summer monsoon rainfall in the Arizona–western New Mexico region, an important concern to the region. M. Bukovsky et al. (2012, unpublished manuscript) discussed the process-level causes for these biases in the NAMS region. Models generally overestimate the mean and interannual variability of winter precipitation in the northern Pacific coast and interior western U.S. regions, but perform well for CAs, SE, and FL.
The model bias in surface insolation varies widely according to RCMs. All but one model overestimate surface insolation over the conterminous United States. Most models generate larger positive bias (or smaller negative bias) over the eastern half of the conterminous United States than the western half. Overall, the annual insolation bias is negatively correlated with the annual precipitation bias (i.e., a negative bias in surface insolation corresponds to a positive bias in precipitation, and vice versa). Seasonally, the relationship between the model biases in surface insolation and precipitation is clearly identified only in the Pacific coast regions, especially in winter, when precipitation is associated with large-scale stratiform clouds. This also suggests that the bias correlation between precipitation and insolation is weaker in the case where convective precipitation is dominant compared to the case where stratiform precipitation is dominant.
Significant variations in model errors according to seasons and regions may point to common problems in physics parameterizations used in today's RCMs, model resolutions, and/or large-scale forcing due to inaccurate reanalysis data. Thus, examinations of related model variables are needed to elucidate the detailed error characteristics and, hopefully, their origins. An example of this is the warm temperature and dry precipitation bias over the plains regions during the summer. It is thought that this has much to do with the ability to represent the diurnal cycle in summer precipitation in this region that arises from relatively small-scale propagating convective systems (Jiang et al. 2006). This implies that to obtain the most accurate climate change assessments possible, we need to improve model physics, which in part will arise from high-resolution simulations (e.g., 1–5 km).
The error characteristics also imply that bias corrections, a key step in applying climate projection data to drive a wide range of impact assessment models, may need to be performed for specific variables and regions. That is, universal bias correction that can work for multiple variables and/or larger areas like the entire conterminous U.S. region may not be feasible. Note that bias correction is highly subjective and that its nature and consequences for transferring climate model information to assessment models remain a research topic. This requires that climate modeling and impact assessment communities work closely to answer the questions in applying climate projections to impact assessments, a key step for developing policies to mitigate and adapt to future climate change.
With the anticipated improvements in computational resources that will enable more fine-resolution climate simulations, future RCM evaluation will require high-quality finescale reference data. For example, regional downscaling using RCMs for extended periods (>10 seasons) with spatial resolutions of a few kilometers is feasible now (e.g., Kim et al. 2013a), albeit with a considerable commitment of computer resources. Such fine spatial resolutions are key to improving simulations in regions of complex terrain, especially surface fields such as precipitation and mountain snowpack that are a critical concern for water supply in the western states. Presently, fine-resolution (<10 km) observations exist only for limited durations, variables, and geography. Thus, future efforts need to be directed to finding fine-resolution data, especially from spaceborne remote sensing, and developing methodologies for applying those data to model evaluations. Establishing the uncertainties in observations will become more important with higher resolution.
We thank Dr. Guan for making the Taylor diagrams. This study was supported by the American Recovery and Re-investment Act (ARRA), NASA NCA (ID 11-NCA11-0028), NASA AIST (AIST-QRS-12-0002), NSF ExArch (ID 1125798), and NSF EaSM (ID 2011-67004-30224) projects. The North American Regional Climate Change Assessment Program (NARCCAP) is funded by NSF, the U.S. Department of Energy, NOAA, and the U.S. EPA Office of Research and Development. The contribution from D. E. Waliser, C. Mattmann, C. Goodale, A. Hart, P. Zimdars, D. Crichton, H. Kyo, P. Loikith, and M. Boustani to this study was performed on behalf of the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.