The U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) program User Facility produces unique, long-term, continuous ground-based measurements of atmospheric state, precipitation, turbulent fluxes, radiation, aerosol, cloud, and the land surface at multiple sites. These comprehensive datasets have been widely used to calibrate climate models and have proven invaluable for climate model development and improvement. This article introduces an evaluation package to facilitate the use of ground-based ARM measurements in climate model evaluation. The ARM data-oriented metrics and diagnostics package (ARM-DIAGS) includes both ARM observational datasets and a Python-based analysis toolkit for computation and visualization. The observational datasets are compiled from multiple ARM data products and specifically tailored for use in climate model evaluation. ARM-DIAGS also includes simulation data from models participating in the Coupled Model Intercomparison Project (CMIP), which allows climate-modeling groups to compare a new, candidate version of their model with existing CMIP models. The analysis toolkit is designed to make the metrics and diagnostics quickly available to model developers.
A set of standard metrics and diagnostics provides an effective way for climate modeling centers to routinely assess their model performance and judge the improvement of model simulations from new parameterizations. In the past, climate model developers have often relied on satellite remote sensing products to calibrate and tune their models. Satellite datasets provide great global coverage; however, it is difficult to apply satellite data in some process studies due to their poor temporal resolution. Therefore, utilizing detailed high-frequency ground-based measurements for a comprehensive collection of quantities can be a complementary test in model evaluation.
Over the past three decades, the U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) program has established several permanent research sites and deployed a number of ARM Mobile Facilities (AMF) in diverse climate regimes around the world to collect long-term continuous field measurements of clouds, aerosols, and radiation and their associated large-scale environments. These detailed field observations have provided a unique observational basis specifically for understanding cloud and precipitation related processes and evaluating and improving their representations in climate models. However, ARM data have not been extensively utilized in current model development workflows. With the growing interest in the climate modeling community in developing process-oriented metrics and diagnostics to aid parameterization development (Maloney et al. 2019), the high-frequency process-oriented ARM observations should play a more important role in future metrics and diagnostics development.
In this article, we introduce the recently developed ARM data-oriented metrics and diagnostics package (ARM-DIAGS) for the global climate community to facilitate the use of ARM field data in climate model evaluation. The focus is on unique ARM observations of clouds and aerosols, as well as process-oriented diagnostics aimed particularly at improving the representation of cloud and precipitation related processes in climate models, such as those included in the Coupled Model Intercomparison Project (CMIP). The package is publicly available in the hope that it can serve as an easy entry point for climate modelers to compare their models with ARM data and the supplementary CMIP datasets.
Overview of the ARM data-oriented metrics and diagnostics package
The ARM-DIAGS development closely follows the CMIP protocol so that the ARM metrics and diagnostics package can be distributed efficiently, along with other metrics packages, to the CMIP community and other climate modeling centers. For this purpose, the diagnostic toolkit is built with the Python programming language and utilizes Python libraries for scientific analysis (such as NumPy and matplotlib). Additional Python packages developed by DOE [i.e., the Community Data Analysis Tools (CDAT), https://cdat.llnl.gov/] are also used. Four components are currently included in ARM-DIAGS: 1) a Python-based analysis program; 2) an ARM-based collection of mean, diurnal cycle, and seasonal cycle climatologies as well as high-time-frequency data for process-oriented diagnostics; 3) a database of simulation data from models contributed to the CMIP project; and 4) relevant technical documentation for ARM-DIAGS.
The observations used to assess model performance primarily rely on the ARM Best Estimate (ARMBE) data products (Xie et al. 2010) and other ARM value-added products (VAPs; www.arm.gov/capabilities/vaps), which are available for all the ARM permanent research sites and some ARM mobile facilities. These data often rely on measurements at the ARM Central Facility (CF) locations (i.e., single point measurements). To improve model–observation comparison, the ARM long-term continuous forcing data (Xie et al. 2004), which represent an average over a global climate model (GCM) grid box, are also used when available. For cloud properties such as cloud liquid and ice water contents, the ARM Cloud Retrieval Ensemble Data (ACRED; Zhao et al. 2012) are used. Detailed information about the ARM data used in the ARM-DIAGS package is listed in Tables 1 and 2. The observational data products consist of hourly averages, diurnal cycles, monthly means, or climatological summaries of the measured quantities, with variable names, units, and vertical dimensions remapped to CMIP conventions. They are currently available for the Southern Great Plains (SGP) site (Table 1) as well as the North Slope of Alaska (NSA) Barrow (now known as Utqiaġvik) site and the Tropical Western Pacific (TWP) Manus, Nauru, and Darwin sites (Table 2). In addition to the ARM observations, ARM-DIAGS includes simulation data from models participating in the CMIP project, which allows climate-modeling groups to compare a new candidate version of their model with existing CMIP models. The full list of metrics and diagnostics is as follows, with a subset demonstrated in the “Facilitating use of ARM data in climate model evaluation” section of this article:
a set of basic metrics tables: mean, mean bias, correlation, and root-mean-square error based on annual cycle of each variable;
line plots and Taylor diagrams (Taylor 2001) for annual cycle variability of each variable;
contour and vertical profiles of annual cycle and diurnal cycle of cloud fraction;
line and harmonic dial plots (Covey et al. 2016) of diurnal cycle of precipitation;
probability density function (PDF) plots of precipitation rate (Pendergrass and Hartmann 2014); and
convection onset metrics showing the statistical relationship between precipitation rate and column water vapor (Schiro et al. 2016).
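As an illustration of the basic metrics table listed above, the following minimal Python sketch computes the mean, mean bias, correlation, and root-mean-square error from two 12-month annual-cycle climatologies. It is not taken from the ARM-DIAGS source code: the function name `annual_cycle_metrics` and the synthetic temperatures are hypothetical, and each month is weighted equally for simplicity.

```python
import numpy as np


def annual_cycle_metrics(model, obs):
    """Basic metrics for two annual-cycle climatologies (one value per month).

    Hypothetical helper for illustration; months are weighted equally.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    if model.shape != obs.shape:
        raise ValueError("model and obs must have the same shape")
    diff = model - obs
    return {
        "model_mean": float(model.mean()),
        "obs_mean": float(obs.mean()),
        "bias": float(diff.mean()),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "corr": float(np.corrcoef(model, obs)[0, 1]),
    }


# Synthetic monthly surface temperatures (K); these are not ARM or CMIP data.
months = np.arange(12)
obs_t = 288.0 + 12.0 * np.cos(2.0 * np.pi * (months - 6) / 12.0)
model_t = obs_t + 2.0  # an idealized model that is uniformly 2 K too warm
metrics = annual_cycle_metrics(model_t, obs_t)
```

In this idealized case the bias and the root-mean-square error are both 2 K while the correlation stays at 1, which shows why the table reports these quantities together: a constant offset does not degrade the correlation.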
Facilitating use of ARM data in climate model evaluation
Diagnosis of summertime warm bias.
The data and diagnostics provided through ARM-DIAGS have been used for studying the systematic warm bias in surface temperature found among the climate models in summertime over continental midlatitudes including the ARM SGP site (C. Zhang et al. 2018). The biases are consistent with both overestimated surface shortwave radiation and underestimated evaporative fraction, which contribute to the warm bias as illustrated in Fig. 1. These diagnostics provide an integrated picture with detailed field observations to identify possible model deficiencies in representing cloud, radiation, and land properties, as well as their interactions.
Diurnal cycle of cloud fraction.
The diurnal cycle of cloud fraction can serve as a critical test of the models’ representation of the physical processes controlling the cloud life cycle. One unique product from ARM is cloud vertical profile measurements derived from an integration of multiple active remote sensors, including millimeter wavelength cloud radars, laser ceilometers, and micropulse lidars [Active Remote Sensing of Clouds product (ARSCL)]. Figure 2 shows a comparison between the observed and simulated diurnal cycle of cloud vertical structure over the ARM midlatitude and tropical sites (i.e., SGP and Manus), where a prominent climatological diurnal cycle of clouds is present. Over the SGP site, the Energy Exascale Earth System Model (E3SM) lacks the transition from shallow to deep clouds during summertime [June–August (JJA)]. This is a common model bias related to model deep convection that is triggered too easily and does not allow low clouds to build up. The Manus site exhibits a strong diurnal cycle, with a maximum in low cloud fraction occurring at early local noon, followed by a maximum in high cloud hours later. Similarly, the model generally underestimates low cloud and overestimates high cloud, which also lacks diurnal variability.
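A diurnal composite of cloud vertical structure such as the one compared here can be sketched as an average of a time–height cloud fraction array over each local hour. The Python example below is a minimal sketch under stated assumptions, not the ARM-DIAGS implementation: the function name `diurnal_composite`, the fixed UTC offset, and the synthetic hourly input are all hypothetical.

```python
import numpy as np


def diurnal_composite(cloud_frac, hours_utc, utc_offset_hours=0.0):
    """Average a (time, height) array into a (24, height) local-hour composite.

    hours_utc holds the time of each sample in hours (any origin at 00 UTC);
    utc_offset_hours is an assumed fixed offset from UTC to local time.
    """
    cloud_frac = np.asarray(cloud_frac, dtype=float)
    local = (np.asarray(hours_utc, dtype=float) + utc_offset_hours) % 24
    local_hour = np.floor(local).astype(int)
    composite = np.full((24, cloud_frac.shape[1]), np.nan)
    for h in range(24):
        sel = local_hour == h
        if sel.any():
            # Missing samples (NaN) are ignored within each local hour.
            composite[h] = np.nanmean(cloud_frac[sel], axis=0)
    return composite


# Synthetic input: 10 days of hourly cloud fraction on 3 levels (not ARSCL data).
t = np.arange(240)
cf = 0.1 * np.arange(3)[None, :] + 0.01 * (t % 24)[:, None]
comp = diurnal_composite(cf, t, utc_offset_hours=-6.0)
```

Applying the same function to observed and simulated fields on a common height grid gives two arrays that can be contoured side by side against local hour and height.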
Diurnal cycle of precipitation.
The diurnal cycle of precipitation often serves as a benchmark for climate models. The diurnal cycle diagnostics in ARM-DIAGS, which compare the precipitation intensity and its peak time, have been utilized by the E3SM development team to assess the performance of a newly developed convection triggering mechanism (Xie et al. 2019). Figure 3 shows that none of the climate models, including the default E3SM, captures the observed nocturnal peak, which is often associated with the eastward propagation of mesoscale convective systems. A recently developed convective triggering function, which incorporates an empirical dynamic constraint and allows elevated convection to be captured, begins to capture the early morning peak time, although the intensity is still too weak. These diagnostics are worth repeating routinely, especially when new features in convection parameterizations are implemented.
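The intensity and peak time compared in such diagnostics can be summarized by the first (24-h) harmonic of the composite diurnal cycle, which is the quantity typically displayed on a harmonic dial. The Python sketch below estimates its amplitude and time of maximum with a discrete Fourier transform. It is an illustration rather than the ARM-DIAGS code: the function name `first_diurnal_harmonic` is hypothetical, the input is assumed to be evenly spaced samples starting at local hour 0, and the precipitation values are synthetic.

```python
import numpy as np


def first_diurnal_harmonic(diurnal_mean):
    """Amplitude and local time of maximum of the 24-h harmonic.

    diurnal_mean: evenly spaced composite values covering one day,
    with the first sample at local hour 0 (an assumption of this sketch).
    """
    x = np.asarray(diurnal_mean, dtype=float)
    n = x.size
    c1 = np.fft.rfft(x)[1] / n  # complex coefficient of the first harmonic
    amplitude = 2.0 * np.abs(c1)
    # First harmonic: amplitude * cos(2*pi*t/n + angle(c1)); it peaks where
    # the cosine argument is zero.
    peak_index = (-np.angle(c1) * n / (2.0 * np.pi)) % n
    peak_hour = peak_index * 24.0 / n
    return amplitude, peak_hour


# Synthetic composite (mm/day) with a nocturnal maximum at 02 local time.
hours = np.arange(24)
precip = 3.0 + 1.5 * np.cos(2.0 * np.pi * (hours - 2) / 24.0)
amp, peak = first_diurnal_harmonic(precip)
```

On a dial plot, the amplitude would set the length of the vector and the peak hour its direction on a 24-h clock, so models and observations can be compared at a glance.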
The PDF analysis for daily mean precipitation at the SGP site during June–August is shown in Fig. 4. This example illustrates that models tend to underestimate heavy rainfall (>10 mm day⁻¹) both in frequency (Fig. 4a) and in the amount contributed to the mean precipitation (Fig. 4b). The overlaid result from GPCP (Global Precipitation Climatology Project One-Degree Daily Precipitation Dataset) also confirms this systematic model bias.
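The two panels described above correspond to a frequency distribution (the fraction of days falling in each rain-rate bin) and an amount distribution (the contribution of each bin to the mean precipitation). The Python sketch below shows one simple way to compute both from daily means. The function name `precip_distributions`, the dry-day threshold, the bin edges, and the sample values are hypothetical choices for illustration and are not the settings used in ARM-DIAGS or in Pendergrass and Hartmann (2014).

```python
import numpy as np


def precip_distributions(daily_precip, bin_edges, dry_threshold=0.1):
    """Frequency and amount distributions of daily precipitation (mm/day).

    frequency[i]: fraction of all valid days with precipitation in bin i.
    amount[i]: contribution of bin i to the mean precipitation (mm/day).
    Days below dry_threshold (an assumed value) are counted as dry.
    """
    p = np.asarray(daily_precip, dtype=float)
    p = p[np.isfinite(p)]
    n_days = p.size
    wet = p[p >= dry_threshold]
    counts, _ = np.histogram(wet, bins=bin_edges)
    sums, _ = np.histogram(wet, bins=bin_edges, weights=wet)
    return counts / n_days, sums / n_days


# Ten synthetic daily means (mm/day); not ARM, GPCP, or model data.
daily = np.array([0.0, 0.0, 0.5, 2.0, 5.0, 20.0, 40.0, 0.0, 1.0, 12.0])
edges = np.array([0.1, 1.0, 10.0, 100.0])  # coarse logarithmic bins
frequency, amount = precip_distributions(daily, edges)
```

When every wet day falls inside the bins, the amount distribution sums to the mean precipitation, so a model can match the mean while still placing too little of it in the heavy-rain bins.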
Convection onset metrics.
Convection onset metrics allow users to compare diagnostics for the behavior of deep convection from ARM observations to model output. The statistics quantify robust relationships between precipitation, column water vapor (CWV), and temperature. This includes the sharp increase or “pickup” in conditional-average precipitation rate above a critical CWV value seen in Fig. 5a, which is easily identifiable for short time averages at tropical ARM sites. The pickup represents the onset of conditional instability yielding strong convective precipitation (Schiro et al. 2016) and is also seen in the probability of precipitation (Fig. 5b). The probability density of CWV and the contribution from precipitating points (Fig. 5c) have a drop in probability density at high CWV corresponding to the regime with high precipitation loss above the critical CWV value. These features are robust to spatial averaging up to about 2° latitude–longitude and time averaging up to about 3 h (Kuo et al. 2018), aside from slight increases in probability (Fig. 5b) with averaging.
The statistics discussed here can distinguish between convective parameterizations in models (Kuo et al. 2020). An example of model comparison is given in Fig. 5. An important diagnostic in the model evaluation of convection onset concerns the critical CWV value where the precipitation pickup begins. Many models exhibit a pickup at lower CWV than observations (Kuo et al. 2020), as seen in Fig. 5a for E3SM. This mismatch persists even when temperature dependence (not shown but will be included in a future release of ARM-DIAGS) is included by binning by the saturation water vapor.
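The convection onset statistics can be sketched by binning precipitation samples by CWV and computing, for each bin, the conditional-average precipitation rate, the probability of precipitation, and the probability densities of CWV for all points and for precipitating points. The Python example below is a minimal illustration under stated assumptions and is not the ARM-DIAGS implementation: the function name `convection_onset_stats`, the 0.5 mm/h precipitation threshold, the 5-mm bin width, and the synthetic data with a pickup at 60 mm are all hypothetical.

```python
import numpy as np


def convection_onset_stats(cwv, precip, bin_edges, precip_threshold=0.5):
    """Per-CWV-bin statistics: conditional-mean precipitation, probability of
    precipitation, PDF of CWV, and PDF of CWV for precipitating points.

    precip_threshold (mm/h) defines a precipitating point; it is an assumed value.
    """
    cwv = np.asarray(cwv, dtype=float)
    precip = np.asarray(precip, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    ok = np.isfinite(cwv) & np.isfinite(precip)
    cwv, precip = cwv[ok], precip[ok]
    n_bins = bin_edges.size - 1
    idx = np.digitize(cwv, bin_edges) - 1  # values outside the edges get -1 or n_bins
    n_total = np.count_nonzero((idx >= 0) & (idx < n_bins))
    widths = np.diff(bin_edges)
    cond_mean = np.full(n_bins, np.nan)
    prob_precip = np.full(n_bins, np.nan)
    pdf_cwv = np.zeros(n_bins)
    pdf_precip = np.zeros(n_bins)
    for b in range(n_bins):
        sel = idx == b
        n = np.count_nonzero(sel)
        if n == 0:
            continue
        raining = precip[sel] > precip_threshold
        cond_mean[b] = precip[sel].mean()
        prob_precip[b] = raining.mean()
        pdf_cwv[b] = n / (n_total * widths[b])
        pdf_precip[b] = np.count_nonzero(raining) / (n_total * widths[b])
    return cond_mean, prob_precip, pdf_cwv, pdf_precip


# Synthetic samples with an idealized pickup at CWV = 60 mm (not ARM data).
rng = np.random.default_rng(0)
cwv_mm = rng.uniform(30.0, 75.0, 5000)
rain = 0.5 * np.maximum(cwv_mm - 60.0, 0.0)  # mm/h
edges_mm = np.arange(30.0, 76.0, 5.0)
cond_mean, prob_precip, pdf_cwv, pdf_precip = convection_onset_stats(
    cwv_mm, rain, edges_mm
)
```

Running the same function on model output and on observations with identical bins makes a shift in the critical CWV visible as a horizontal offset between the two conditional-mean curves.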
Summary and future work
The ARM metrics and diagnostics package is designed and developed to facilitate the use of ARM ground-based in situ measurements in climate model evaluation. Metrics and diagnostics evaluating the simulated atmospheric and cloud fields are generated by running a Python program in a simple software environment based on CDAT. The ARM-DIAGS v2.0 analysis code is publicly available through GitHub (https://github.com/ARM-DOE/arm-gcm-diagnostics) under the ARM User Facility project space. This code package is envisioned as a central place for the community to share analysis scripts that produce metrics and diagnostics based on ARM data. The analysis data, which include the ARM observational datasets and the reference CMIP5 AMIP data, can be downloaded through the ARM archive (www.arm.gov/capabilities/vaps/adcme-123). For now, the default requirement for the input model data is that they follow CMIP conventions. Anyone interested in applying ARM-DIAGS to a specific model should contact the development team via our GitHub page for configurations specific to a model run.
Future work includes extending ARM-DIAGS to the ARM Eastern North Atlantic (ENA) site (a new fixed site) and ARM AMF sites. CMIP6 data will be included as they become more widely available. Ongoing work includes incorporating the recently developed ARM cloud radar simulator (Y. Zhang et al. 2018) into ARM-DIAGS to improve the comparison between model clouds and ARM cloud radar observations, as well as adding temperature dependence to the convection onset statistics. In addition, utilizing other sources of observations, such as those retrieved from satellites, as supplementary data can help address issues associated with observation uncertainty and data resolution. Moving forward, we will focus particularly on adding process-oriented diagnostics to ARM-DIAGS. The diagnostics suite will be continuously improved in close collaboration with scientists in the field. To make this package broadly accessible and widely used, we plan to integrate it into other commonly used Python-based metrics packages in the GCM community, such as the PCMDI’s metrics package (PMP) and the DOE E3SM diagnostics package (E3SM-DIAGS), to provide routine model evaluation at ARM sites.
This research was supported by the DOE Atmospheric Radiation Measurement (ARM) program and performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344. IM release: LLNL-ABS-789643. Work at UCLA was supported by U.S. Department of Energy Grant DE-SC0011074 and subcontract B634021 and National Science Foundation AGS-1540518, AGS-1936810. We acknowledge the U.S. Department of Energy's Program for Climate Model Diagnosis and Intercomparison for providing coordinating support and leading development of software infrastructure in partnership with the Global Organization for Earth System Science Portals for making CMIP data available.
CURRENT AFFILIATION: Google LLC, Mountain View, California