## 1. Introduction

General circulation models (GCMs) employed in climate projections are the tool of choice for quantifying the anthropogenic influence on Earth's climate, ultimately answering the question of how strongly humanity influences global-mean surface temperature. Over the past decades, GCMs have undergone considerable development, manifested in steadily increasing complexity and resolution. However, uncertainty in climate sensitivity has not been substantially reduced since its ad hoc introduction by Charney et al. (1979), and major atmospheric processes are still subject to considerable uncertainties. Of these, atmospheric convection and the clouds and feedbacks associated with it are most likely the most uncertain in the latest generation of GCMs (Randall et al. 2007). This is true not only for the multimodel ensemble of phase 3 of the Coupled Model Intercomparison Project (CMIP3; Meehl et al. 2007); model parameters associated with convection are also often the most sensitive in perturbed parameter ensembles (Murphy et al. 2004; Klocke et al. 2011).

Uncertainties in the representation of convection in current-generation GCMs not only lead to uncertainties in estimates of climate sensitivity but also manifest themselves in an erroneous simulation of precipitation. Generally, GCMs are capable of capturing the overall amount of precipitation well, but the spatial distribution and variance often compare poorly to observations (e.g., Dai 2006; Pincus et al. 2008). Because of the limited spatial resolution of a GCM, atmospheric convection occurs at subgrid scales; it therefore cannot be explicitly resolved and must be parameterized. Since the emergence of the first convection parameterization techniques some four decades ago, the response of convective elements to a given large-scale atmospheric state has mostly been formulated as purely deterministic [see Arakawa (2004) for a review], which implicitly prevents a particular model integration from developing convective variability beyond that given by the atmospheric state at the gridpoint level.

Only in the last decade has a possible solution to this lack of variability in parameterized subgrid-scale processes emerged. This solution is based on representing the variability in the response of unresolved processes to the large-scale environment in a dynamically stochastic rather than a purely deterministic manner (Palmer 2001) and has been shown to increase the predictive skill of numerical weather prediction (e.g., Buizza et al. 1999).

Specifically targeted toward improving the representation of convection, Lin and Neelin (2000, 2003) introduced random perturbations to convective available potential energy (CAPE) or the heating profile of the host convective scheme and found that even such a simple approach significantly enhanced precipitation variance toward that of observations. Randomly perturbing the trigger function of the Kain–Fritsch convection scheme also proved to yield an increase in predictive skill (Bright and Mullen 2002). Teixeira and Reynolds (2008) randomly sampled convective-parameterization-relevant variables from a subgrid-scale distribution and found an increase in the spread of an ensemble prediction system and in particular a better representation of tropical convection. A similarly simple approach was taken by Tompkins and Berner (2008), who randomly sampled a subgrid-scale relative humidity distribution to perturb a convective parcel's initial humidity and/or the humidity of the entrained air during ascent. Although promising results were obtained for midlatitudes, the methodology employed did not yield improvements in tropical convection. In all the studies mentioned above, the randomly sampled deviations were assumed proportional to the mean of the perturbed variable, an assumption shown to be valid when using cloud-resolving model data as a surrogate for observations (Shutts and Palmer 2007).

Taking a step further from just modifying the input parameters for existing convective parameterization closures and cloud models, several recent studies focused on formulating more advanced stochastic schemes. Majda and Khouider (2002) introduced a stochastic parameterization of convective inhibition (CIN) based on the Ising model of statistical mechanics. It is coarse grained to obtain a Markov birth–death process, which is two-way coupled to the large-scale dynamics and which can be integrated with very little computational overhead (Khouider et al. 2003). The stochastic CIN model is used in Khouider et al. (2003) and in Majda et al. (2008) to improve the wave variability and climate in an otherwise deficient mass flux–like parameterization in the context of a simple 1.5-layer toy GCM. Plant and Craig (2008) calculated a distribution of convective plumes and then randomly sampled this distribution to obtain a plume ensemble that matches a required gridbox-mean mass flux given by a CAPE closure. Testing in a single-column model environment yielded high variability for small grid boxes, approaching the deterministic limit with increasing gridbox size. Recently, this scheme was tested in a limited-area model ensemble over central Europe and results showed a promising increase in precipitation variance (Groenemeijer and Craig 2012). Although not concentrating on deep convection, the study of Dorrestijn et al. (2013) represents a notable approach to stochastic parameterization of shallow cumulus convection. They applied a Markov chain method to sample pairs of turbulent heat and moisture fluxes obtained from large-eddy simulations (LESs) and found a good agreement in the calculated ensemble spread compared to the LES data. Following the coarse-graining ideas used in Khouider et al. (2003), Khouider et al. (2010, hereafter KBM10) designed the stochastic multicloud model (SMCM) based on a birth–death process to represent tropical convection. 
The SMCM calculates the evolution of a cloud population consisting of three cloud types associated with tropical convection (congestus, deep convection, stratiform) constrained by the large-scale atmospheric state. The state of the cloud ensemble at any given time and large-scale forcing is represented by area fractions per cloud type on a subgrid-scale lattice. The SMCM was shown to reasonably simulate tropical convection and associated wave features when coupled to a simple two-layer atmospheric model [KBM10; Frenkel et al. 2012 (hereafter FMK12), 2013].

As the vast majority of today's GCM convection schemes are mass flux schemes, the cloud area fractions simulated by the SMCM could prove valuable for introducing a stochastic component to such schemes. Then at least one part (area) of the cloud-base mass flux would yield a stochastic component, leaving the other part (updraft velocity) to be assigned in another suitable fashion.
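As a brief illustration of this decomposition, and assuming the common bulk form of the cloud-base mass flux (the symbols below are introduced here for illustration and are not defined in the source), the mass flux can be written as the product of air density, fractional updraft area, and updraft velocity:

```latex
% Assumed bulk form; \rho_b, \sigma_b, w_b are illustrative symbols.
% \sigma_b is the fractional area that a stochastic cloud model could supply,
% w_b is the cloud-base updraft velocity that would be assigned separately.
M_b = \rho_b \, \sigma_b \, w_b
```

In this reading, a stochastic area fraction makes the mass flux stochastic even if the updraft velocity is prescribed deterministically.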

It is the aim of this study to provide an assessment of whether the underlying framework of the SMCM is suitable to reproduce observed convective behavior. In doing so, we analyze observed convective behavior and subsequently adjust the model parameters, which have so far been based on sensible empirical assumptions (KBM10), to match the observed mean response of convection to the large-scale state. We then use the resulting adjusted model to test whether its underlying framework is suitable to reproduce the statistical mean behavior of observed convection, the positive outcome of which would render the SMCM a useful tool for convection parameterization.

The observational dataset that we use in this study is described in Davies et al. (2013) and represents a long-term, large-scale dataset for three consecutive wet seasons over Darwin, Australia, complemented by an identically derived but shorter dataset representative for Kwajalein. The Darwin dataset has been shown to contain valuable information for characterizing relationships between atmospheric convection and the large-scale state, with one of the most notable findings being that the relationships between convection and CAPE or vertical velocity are shown to be entirely stochastic or quasi deterministic, respectively (Davies et al. 2013).

We introduce the basics of the SMCM, the observational dataset, and the observation-derived forcing for the SMCM in section 2 and present the statistical relationships of observed convection to large-scale variables in section 3. We then adjust the parameters of the SMCM, force it with the observed large-scale state, and analyze the statistics of the modeled convection as well as the stochasticity of the model solution in section 4. Section 5 gives a summary, conclusions, and a short outlook.

## 2. Prerequisites: The model and the observations

In this study, we utilize the recently introduced SMCM (KBM10) in conjunction with a large-scale observational dataset representative of a tropical location. In a nutshell, we investigate the degree to which the mathematical framework of the SMCM is suitable to reproduce the behavior of observed tropical convection, a necessary step toward a possible future usage in GCMs. In the following, we briefly introduce the SMCM (section 2a) and the observational dataset (section 2b).

### a. The SMCM: A short introduction

Given the temporal evolution of a large-scale atmospheric state representative of a tropical location, the SMCM simulates the evolution of an ensemble of three cloud types associated with tropical convection on a lattice containing *n* × *n* sites. The considered cloud types are congestus and deep convective as well as stratiform clouds (shallow convection is not considered) and the large-scale atmospheric state is given by two variables: one representing a proxy for convective propensity and the other representing a proxy for midtropospheric dryness (cf. section 2c). In the SMCM, the evolution of the cloud ensemble is represented by a coarse-grained birth–death process. This process is evolved in time by means of an acceptance–rejection Markov chain Monte Carlo method based on Gillespie's exact algorithm (Gillespie 1975; see KBM10 for details of the implementation). Each individual lattice site can take one of four states: clear sky, congestus cloud, deep convective cloud, or stratiform cloud. The total size of this lattice, say 20 × 20 sites, is assumed as being representative of a GCM grid box, but there is no explicit spatial scale associated with either the individual lattice sites or the total lattice. There is also no spatial coherence between individual lattice sites (i.e., the temporal evolution at one site is completely independent of that of its neighbors). However, local interactions between lattice sites can be easily incorporated, provided the strength and nature of these interactions are understood (Khouider 2013).

The evolution of this birth–death process is determined by a set of equations that define transition rates from one of the four states (see above) to another. Individual transition rates can, but need not, be dependent on the given large-scale state and their formulation is mainly inspired by physical intuition and based on specific rules; for example, a deep convective cloud is not allowed to form from a stratiform cloud (see KBM10 for details). The individual transition rates are associated with time scales assumed to be representative for a specific transition. These transition time scales have been chosen in an ad hoc but physically meaningful manner and represent the only parameters that can be used to tune the SMCM in its current formulation. KBM10 presented two sets of transition time scales, both of which are based on physical intuition gained from observations and modeling studies of tropical convection and should be considered as rough estimates. Recently, FMK12 found a third set of transition rates that improves the intermittency of simulated convection compared to the results of KBM10. In this study, we use observations to take a closer look at these previously made choices of transition time scales.
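The birth–death evolution described above can be illustrated with a minimal sketch. It uses the plain Gillespie direct method on site counts rather than the acceptance–rejection variant of KBM10. The transition rates in `RATES`, the function name, and all numerical values are hypothetical placeholders, not the KBM10 time scales, and the rates are held constant here instead of depending on the large-scale state. Only the forbidden stratiform-to-deep transition is taken from the text.

```python
import numpy as np

CLEAR, CONGESTUS, DEEP, STRATIFORM = range(4)

# Hypothetical transition rates (per hour); rows = from-state, columns = to-state.
# Illustrative values only, not those of KBM10. A zero forbids a transition,
# e.g. stratiform -> deep convective is not allowed.
RATES = np.array([
    [0.0, 0.5, 0.2, 0.0],  # clear      -> congestus, deep
    [1.0, 0.0, 0.5, 0.0],  # congestus  -> clear, deep
    [0.5, 0.0, 0.0, 1.0],  # deep       -> clear, stratiform
    [0.3, 0.0, 0.0, 0.0],  # stratiform -> clear
])


def gillespie_area_fractions(n=20, t_end=48.0, rates=RATES, seed=0):
    """Evolve an n x n lattice of independent sites; return times and area fractions."""
    rng = np.random.default_rng(seed)
    n_sites = n * n
    # Sites are independent, so only the number of sites per state matters.
    counts = np.array([n_sites, 0, 0, 0])  # start from clear sky everywhere
    t = 0.0
    times, fractions = [t], [counts / n_sites]
    while True:
        # Propensity of each transition = (sites in from-state) * rate.
        propensities = counts[:, None] * rates
        total = propensities.sum()
        if total == 0.0:
            break
        # Waiting time to the next event is exponentially distributed.
        t += rng.exponential(1.0 / total)
        if t > t_end:
            break
        # Pick which transition fires, with probability proportional to propensity.
        k = rng.choice(16, p=propensities.ravel() / total)
        src, dst = divmod(k, 4)
        counts[src] -= 1
        counts[dst] += 1
        times.append(t)
        fractions.append(counts / n_sites)
    return np.array(times), np.array(fractions)


times, fractions = gillespie_area_fractions()
print(fractions[-1])  # area fractions (clear, congestus, deep, stratiform) at the last event
```

Tracking counts instead of individual sites is possible only because the sites do not interact; local interactions would require the full lattice state.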

So far, the SMCM has not been used in combination with observations but has been coupled to a simple two-layer atmospheric model capable of capturing the main characteristics of tropical convection and associated wave features (Khouider and Majda 2006, 2008a,b; KBM10). There, simple formulations of precipitation formation and the associated heating profiles accounted for the feedback to the dynamics. Recently, FMK12 explored the capabilities of the SMCM in the context of improving GCM convection parameterizations by applying the abovementioned two-layer model to flows around an equatorial ring. They found that using the SMCM increases the variability of tropical convection compared to a deterministic convection parameterization and that the SMCM is able to produce a realistic Walker cell circulation when forced with a longitudinal SST gradient.

One may argue that the capability of the SMCM to produce sensible results is given by its design principles (e.g., prescribing certain transition time scales, assuming tropical convection to be dependent on two predictors only, or coupling it to a simple two-layer atmospheric model). In fact, a comparison of the SMCM-simulated cloud area fractions to observational data is still outstanding. It is the aim of this study to use the SMCM in a diagnostic fashion by forcing it with an observed large-scale state to investigate the feasibility of using its underlying stochastic concept for convective parameterizations in full GCMs.

### b. Two datasets of observed large-scale atmospheric state over tropical areas

We utilize two datasets comprising various quantities describing the large-scale atmospheric state over a tropical location for the purpose of this study. One dataset covers an approximately 190 × 190 km^{2} pentagon-shaped area centered over Darwin, Australia (Davies et al. 2013), investigated during the Tropical Warm Pool-International Cloud Experiment (TWP-ICE; May et al. 2008). The size of the area is chosen to approximately represent that of a typical GCM grid box and the gridbox-mean values of atmospheric variables are computed using a variational analysis after Zhang and Lin (1997). This variational analysis is applied to a large part of three consecutive wet seasons (2004/05, 2005/06, 2006/07). Over northern Australia, the wet season is defined as the time period between September of one year and April of the following year. The dataset and its documentation can be obtained via the Atmospheric Radiation Measurement (ARM) Climate Research Facility's website (http://www.arm.gov/data/pi/46) and we use all available data for the analysis presented here. Atmospheric variables are available every 6 h. Information on clouds and precipitation is retrieved from radar observations by the C-band polarimetric (CPOL) research radar (Keenan et al. 1998) located at Gunn Point and operated by the Australian Bureau of Meteorology. From those data, rain area fractions attributable to either stratiform or convective precipitation are determined after Steiner et al. (1995) and used as a proxy for stratiform and convective cloud fractions (Kumar et al. 2013b). Convective clouds are separated into congestus and deep convection according to cloud-top height (CTH): convective clouds having CTHs of less than 7 km are classified as congestus whereas clouds having higher CTHs are classified as deep convective clouds (Kumar et al. 2013a). The dataset encompasses the period of TWP-ICE (May et al. 2008), which took place in the same area during January and February 2006. 
The collected data of meteorological regimes encountered during TWP-ICE have already proven to be very valuable for the evaluation of GCM convective parameterizations (e.g., Lin et al. 2012).
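The separation into cloud types described above can be sketched as follows. This is an illustration under stated assumptions, not the processing chain of the cited studies: the per-pixel string labels, the function name, and the treatment of a cloud-top height of exactly 7 km as deep convective are hypothetical choices, and the convective/stratiform classification itself is taken as given.

```python
import numpy as np


def cloud_area_fractions(rain_type, cth_km, threshold_km=7.0):
    """Area fractions per cloud type from per-pixel rain type and cloud-top height.

    rain_type: array of 'none', 'stratiform', or 'convective' (hypothetical encoding).
    cth_km:    cloud-top height per pixel in km.
    """
    rain_type = np.asarray(rain_type)
    cth_km = np.asarray(cth_km, dtype=float)
    convective = rain_type == "convective"
    # Convective pixels with cloud tops below the threshold count as congestus,
    # all other convective pixels as deep convection.
    congestus = convective & (cth_km < threshold_km)
    deep = convective & ~(cth_km < threshold_km)
    n_pixels = rain_type.size
    return {
        "congestus": congestus.sum() / n_pixels,
        "deep": deep.sum() / n_pixels,
        "stratiform": (rain_type == "stratiform").sum() / n_pixels,
    }


fractions = cloud_area_fractions(
    ["none", "stratiform", "convective", "convective"],
    [0.0, 9.0, 5.0, 12.0],
)
print(fractions)
```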

The second dataset represents the large-scale atmospheric state over Kwajalein and is obtained by applying the same variational analysis as is used for the Darwin dataset. Convective- and stratiform-precipitation area fractions are also calculated according to Steiner et al. (1995); however, congestus area fractions are not available because the radar data available to us only consist of horizontal 2D scans. The Kwajalein dataset covers a shorter time period (May 2008–January 2009) and was produced to match the observation intensive period of the Year of Tropical Convection (YOTC; Waliser and Moncrieff 2007) project. For better comparability, the Kwajalein data are derived for an area identical to the pentagon-shaped one over Darwin.

We use both datasets in this study to show that the functional dependency of tropical convection on a given large-scale atmospheric state is similar for both locations although they are subject to distinctly different boundary conditions (e.g., land–sea distribution or monsoonal forcing).

To illustrate the multitude of meteorological regimes found in the datasets, we show the time series of selected atmospheric parameters for the period 10 November 2005–18 April 2006 over Darwin in Fig. 1. Apart from the variability during the TWP-ICE period (19 January–28 February 2006; May et al. 2008), the snapshot shown in Fig. 1 alone contains a number of clear meteorological regime changes that result in distinctly different cloud populations. The time series of midtropospheric relative humidity qualitatively exemplifies “wet” periods around 20 January or 1 April 2006 (among others) and “dry” periods around 25 November 2005 or 1 March 2006 (among others). As shown in the plot of derived convective and stratiform cloud fractions, these wet and dry periods are each associated with specific cloud regimes: the wet periods are generally associated with higher cloud fractions than the dry periods. Stratiform clouds exhibit the highest cloud area fractions, with deep convective cloud fraction being about an order of magnitude smaller and congestus cloud fraction about another order of magnitude smaller still. It must be noted that the derived cloud area fractions are representative of precipitating clouds only. However, this does not present a serious issue, because fractions of tropical congestus, deep convective, or stratiform clouds derived from the scanning rain radar compare very well to those derived from a vertically pointing cloud radar (Kumar et al. 2013a).

It should be mentioned at this point that the observational data to which we compare the SMCM-simulated cloud fractions are also subject to uncertainties and leave room for interpretation. The most prominent uncertainty is the estimation of rain rates from radar echoes, which is itself not straightforward, and the subsequent assumption that the area of a particular type of rainfall [derived after Steiner et al. (1995)] is equal to the cloud fraction of that particular cloud type. Therefore, this analysis is limited to precipitating clouds only. Also, land surface characteristics of the geographical area covered by the large-scale observational dataset used in this study are far from homogeneous. The CPOL radar at Gunn Point covers both water and land surfaces, with some of the land surface areas being subject to a pronounced convective diurnal cycle that results in some of the deepest convection on the planet (Keenan et al. 1990; Crook 2001). As these events are locally driven, environmental conditions leading to their initiation cannot be represented in the observational dataset. This uncertainty in environmental conditions obviously does not apply to the Kwajalein data.

### c. Deriving model forcing parameters from the observations

The evolution of the cloud ensemble as simulated by the SMCM with respect to the large-scale atmospheric state is designed to be dependent on two predictors. One parameter, *C*, is used as a proxy for the environment's potential to develop and sustain convection, and the other, *D*, is used as a proxy for midtropospheric dryness. Here, the underlying assumption is that convection is initiated and sustained by high values of *C* and hindered and depleted by high values of *D*. Because we aim to use the SMCM in a diagnostic manner by forcing it with an observed large-scale atmospheric state, we have to derive *C* and *D* from the available observational data. This requires us to adapt the formulas for calculating *C* and *D* as given in KBM10 as these are defined to be used for a large-scale state given by the simple two-layer model (Majda and Shefter 2001; Khouider and Majda 2006).

As mentioned above, *C* and *D* are used as proxies for the convective potential of the tropospheric column and midtropospheric dryness, respectively. In the original SMCM these quantities are scaled to vary roughly between 0 and 2. For the evaluation of the SMCM, we derive a total of five (instead of just two) forcing predictors. We proceed in this way because there may exist a multitude of possible predictor constellations for adequately describing the dependency of tropical convection on the large-scale atmospheric state.

#### 1) *C*—A proxy value for convective propensity

In the original formulation given in KBM10, *C* is given by the scaled CAPE (calculated for a parcel lifted from 990 hPa; *C*_{C} in the following). CAPE corresponding to the time series shown in Fig. 1 yields values in the range 0–1700 J kg^{−1}; we therefore scale the CAPE values by 1000 J kg^{−1} to achieve the desired range of *C*_{C} ∈ [0; 2]. We also consider two alternative formulations of *C*, represented by scaled values of either the ratio of low-level CAPE (LCAPE; i.e., CAPE integrated only to the freezing level) to total CAPE (*C*_{rC}) or the large-scale vertical velocity at 500 hPa, *ω*_{500} (*C*_{ω}). The choice of *C*_{C} and *C*_{ω} is relatively intuitive and straightforward, whereas the choice of *C*_{rC} warrants explanation. KBM10 found that assuming that congestus activity is positively related to LCAPE (derived from a two-layer atmospheric model) rather than total CAPE improves the SMCM variability. However, our observations show that LCAPE alone is roughly constant throughout the whole observational period and it is only the ratio to total CAPE that resembles some relationship with observed convection. For illustrative purposes, we show the time series of *C* for the subset of the data shown in Fig. 1 in the top two panels in Fig. 2.
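The CAPE-based predictor can be sketched in a few lines. The scaling by 1000 J kg^{−1} is the one stated in the text; the function names are hypothetical, and because the scaling applied to the LCAPE-to-CAPE ratio and to the vertical velocity is not reproduced here, the second helper returns only the unscaled ratio.

```python
import numpy as np

CAPE_SCALE = 1000.0  # J kg^-1, scaling stated in the text


def c_from_cape(cape):
    """Scaled CAPE predictor: CAPE divided by 1000 J kg^-1."""
    return np.asarray(cape, dtype=float) / CAPE_SCALE


def lcape_to_cape_ratio(lcape, cape):
    """Unscaled ratio of low-level CAPE to total CAPE (NaN where CAPE is not positive)."""
    lcape = np.asarray(lcape, dtype=float)
    cape = np.asarray(cape, dtype=float)
    ratio = np.full(cape.shape, np.nan)
    # Divide only where total CAPE is positive to avoid division by zero.
    np.divide(lcape, cape, out=ratio, where=cape > 0)
    return ratio


print(c_from_cape([0.0, 1000.0, 1700.0]))  # stays within [0, 2] for the quoted CAPE range
print(lcape_to_cape_ratio([100.0, 200.0], [400.0, 0.0]))
```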

Fig. 2. Time series of model forcing predictors obtained from the large-scale state shown in Fig. 1. (top), (middle) Values for *C* (i.e., the proxy for convective propensity). (bottom) Values for *D* (i.e., the proxy for midtropospheric dryness). See text for calculation of the predictors.

Citation: Journal of the Atmospheric Sciences 70, 11; 10.1175/JAS-D-13-031.1


Recalling the preceding short analysis of wet and dry periods (section 2b), the pattern of *C*_{C} (Fig. 2, top) reveals no evident correlation to these periods. The relatively high values of *C*_{C} during the first 40 days of the time series should yield intense convective activity, but the observed cloud fractions do not support this. However, the observed low convective activity during those roughly 40 days could be explained by a relatively dry middle troposphere as indicated by the time series of *D*_{RH}, which may hinder the development of deep convection (e.g., Redelsperger et al. 2002). Furthermore, the wetter periods are characterized by low *C*_{C} values throughout. However, stratiform cloud fraction, most probably originating from deep convection, is notably high during these periods. This supports a separate analysis of the present dataset, which indeed suggests that, in the area of interest, convective precipitation shows no significant correlation with CAPE (Davies et al. 2013). In fact, CAPE has been shown to be approximately anticorrelated with or be entirely unrelated to precipitation for regions in relatively close proximity to the areas covered by our dataset (Mapes and Houze 1992; McBride and Frank 1999; Sobel et al. 2004).

When convective activity is high, *C*_{rC} exhibits large values (cf. Figs. 1 and 2), implying that in situations of intense convection the total CAPE is dominated by the contribution coming from below the freezing level. Because low-level CAPE itself does not vary too much, it is the lack of contributions to total CAPE coming from above the freezing level that makes up for high values of *C*_{rC}, consistent with the findings of McBride and Frank (1999), who concluded that high values of CAPE are dominated by contributions from above 600 hPa. High values of *C*_{rC} thus imply that during periods of intense convection, such as those shown in Fig. 1, the specific heating profile of stratiform precipitation (i.e., latent heating of the upper troposphere and evaporative cooling of the lower troposphere; e.g., Houze 1997) serves to adjust the lapse rate toward the moist adiabat. However, it is the occurrence of convection itself that may enforce high values of *C*_{rC}, resulting in possible ambiguities when attempting to use it as a predictor for convection.

From a dynamical perspective, it is well known that large-scale vertical ascent, and thus moisture convergence, is associated with and facilitates the development of deep convection [cf. the recent study of Hohenegger and Stevens (2013)]. Like the convective area fractions shown in Fig. 1, the time series of *C*_{ω} also appears highly intermittent and seems to very closely follow the former. This is especially true for the first roughly 40 days of the time series in which the observed stratiform and convective cloud fractions are relatively low. During that particular period, *C*_{ω} shows relatively small values with higher ones occurring sparsely, indicating a weakly but somewhat constantly forced convective regime. However, ambiguities in establishing sound cause-and-effect relationships between *C* and convection are apparent for *C*_{ω}, which is directly related to large-scale convergence, which can in turn be considered as both a cause and consequence of convective heating. In fact, discussion of these ambiguities is one of the most persistent issues in the meteorological community. Ambiguities may also arise from the method to derive *C*_{ω} itself. Vertical pressure velocity *ω* is the key parameter obtained from the variational analysis used to derive the large-scale atmospheric state we use here. Since the variational analysis itself is constrained by total areal rainfall, *ω* is somewhat tuned to match observed rain rates. However, because we use area fractions, and not rain rates, of convective and stratiform rain in our analysis, the causal link to the data processing in the variational analysis is weak.

#### 2) *D*—A proxy for midtropospheric dryness

In the formulation adapted from KBM10, *D* is based on the difference between the equivalent potential temperatures of the boundary layer and the midtroposphere:

$$D = \frac{\theta_{e,\mathrm{BL}} - \theta_{e,m}}{15\ \mathrm{K}}, \tag{2}$$

where *θ*_{e,BL} is the equivalent potential temperature in the boundary layer, *θ*_{e,m} is the equivalent potential temperature in the midtroposphere, and 15 K is a climatological-mean scaling factor (Khouider and Majda 2006). Here, the underlying assumption is that the difference between the equivalent potential temperatures as given in Eq. (2) is large when the middle troposphere is dry compared to the boundary layer. For the calculation of *D* after Eq. (2), we take *θ*_{e,BL} and *θ*_{e,m} as the equivalent potential temperatures at 1000 and 500 hPa, respectively. In addition, we introduce a simpler proxy for representing the midtropospheric dryness by use of the relative humidity at 500 hPa, RH_{500} ∈ [0; 1], which defines *D*_{RH}. The resulting time series of *D* calculated with both methods are shown in Fig. 2 (bottom).
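As a small worked example of the equivalent-potential-temperature proxy, the sketch below assumes that Eq. (2) takes the form of the boundary-layer minus midtropospheric *θ*_{e}, divided by the 15 K scaling factor mentioned in the text. The function name and the sample temperatures are hypothetical.

```python
import numpy as np

THETA_E_SCALE_K = 15.0  # K, climatological-mean scaling factor quoted in the text


def dryness_from_theta_e(theta_e_bl, theta_e_mid):
    """Dryness proxy from equivalent potential temperature (K) at two levels.

    Assumed form: (theta_e at 1000 hPa - theta_e at 500 hPa) / 15 K.
    """
    theta_e_bl = np.asarray(theta_e_bl, dtype=float)
    theta_e_mid = np.asarray(theta_e_mid, dtype=float)
    return (theta_e_bl - theta_e_mid) / THETA_E_SCALE_K


# Hypothetical values: a dry midtroposphere, a moist one, and a column that is
# dry down to the surface (small or negative difference despite dry air aloft).
print(dryness_from_theta_e([350.0, 345.0, 330.0], [335.0, 342.0, 333.0]))
```

The third sample shows why this proxy can take low or negative values when dry air extends into the boundary layer.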

Unlike the time series of *C*, the ones for *D* show a very high level of agreement. Only for two short periods do the values disagree significantly, namely around 5 February and 10 April 2006 of the time series displayed in Figs. 1 and 2. These periods are relatively dry compared to the rest of the time series, with low values of relative humidity reaching down into the boundary layer. For these two cases, relatively high values of *D*_{RH} indicate a dry case, whereas the *θ*_{e}-based *D* takes low (or even negative) values: low values of *θ*_{e} occur throughout the tropospheric column down to the surface, thereby not yielding the anticipated large difference between *θ*_{e} at 1000 and 500 hPa. We use *D*_{RH} for all further analyses presented in this study. Also, KBM10 used […]

## 3. The observed mean convective state at Darwin and Kwajalein

Before assessing whether the mathematical framework of the SMCM is suitable for reproducing observed convective behavior of tropical convection, we first analyze the observations laid out in section 2b in a manner suitable for direct comparison with SMCM output. Given the specific values of the forcing parameters *C* and *D* (cf. section 2c), the birth–death process used in the SMCM yields stationary cloud fraction distributions of every cloud type. Hence, it is possible to calculate a 2D histogram of the stationary cloud fraction as a function of *C* and *D*. Examples of such equilibrium cloud fraction distributions for a given set of transition time scales are given in KBM10. Here, we therefore calculate joint histograms of observed convective and stratiform cloud fractions in the parameter space of observed values of *C* and *D* to enable a straightforward comparison between observed and modeled convective behavior.
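The joint histograms described above can be sketched with NumPy. This is a minimal illustration, not the analysis code of the study: the function name, the number of bins, and the value range are hypothetical, while the rule of showing only bins with more than five observations follows the criterion used for the figures.

```python
import numpy as np


def joint_mean_histogram(c, d, cloud_fraction, bins=10,
                         value_range=((0.0, 2.0), (0.0, 2.0)), min_count=5):
    """Mean and relative standard deviation of cloud fraction per (C, D) bin.

    Bins with min_count or fewer samples are set to NaN.
    """
    c = np.asarray(c, dtype=float)
    d = np.asarray(d, dtype=float)
    f = np.asarray(cloud_fraction, dtype=float)
    # Sample count, sum, and sum of squares of cloud fraction in each bin.
    count, c_edges, d_edges = np.histogram2d(c, d, bins=bins, range=value_range)
    total, _, _ = np.histogram2d(c, d, bins=bins, range=value_range, weights=f)
    total_sq, _, _ = np.histogram2d(c, d, bins=bins, range=value_range, weights=f ** 2)

    valid = count > min_count
    mean = np.full(count.shape, np.nan)
    mean[valid] = total[valid] / count[valid]

    # Relative standard deviation = std / mean, defined only for positive means.
    rel_std = np.full(count.shape, np.nan)
    positive = valid & (total > 0)
    variance = total_sq[positive] / count[positive] - mean[positive] ** 2
    rel_std[positive] = np.sqrt(np.clip(variance, 0.0, None)) / mean[positive]
    return mean, rel_std, count, c_edges, d_edges


# Synthetic example: random predictors and cloud fractions.
rng = np.random.default_rng(0)
c = rng.uniform(0.0, 2.0, 2000)
d = rng.uniform(0.0, 2.0, 2000)
frac = rng.uniform(0.0, 0.1, 2000)
mean, rel_std, count, _, _ = joint_mean_histogram(c, d, frac, bins=5)
print(mean.shape, int(count.sum()))
```

The same binning can be applied to modeled cloud fractions, which makes the observed and simulated histograms directly comparable bin by bin.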

We show such joint histograms of mean observed cloud fractions for three sets of forcing parameters, as well as their relative standard deviations and number of measurements, in Figs. 3–5, for Darwin and Kwajalein. In the three sets of forcing parameters, the midtropospheric dryness parameter is represented by *D*_{RH} and the convection parameter *C* is represented by *C*_{C}, *C*_{rC}, or *C*_{ω}. Because of the observational limitations mentioned above, we only analyze deep convective and stratiform cloud fractions and neglect congestus clouds in the context of this study.

Fig. 3. Joint histogram of observed cloud area fractions and relative standard deviations as function of large-scale variables *C*_{C} and *D*_{RH} at the (left) Darwin and (right) Kwajalein sites: (top) deep convective clouds, (middle) stratiform clouds, and (bottom) sample size per bin. Only pixels having more than five observations are shown. The black markers denote the mean values of *C*_{C} and *D*_{RH}.


As in Fig. 3, but for *C*_{rC} and *D*_{RH}. The black markers denote the mean values of *C*_{rC} and *D*_{RH}.

As in Fig. 3, but for *C*_{ω} and *D*_{RH}. The black markers denote the mean values of *C*_{ω} and *D*_{RH}.

We only discuss the results for Darwin in detail. Generally, the data for Kwajalein show the same relationships as for Darwin, but with less frequent high values of the *C* parameter and generally smaller stratiform cloud fractions. The important finding to keep in mind is that convective and stratiform cloud area fractions show very similar behavior at both locations given a particular large-scale atmospheric state, justifying using the observations from both locations together to investigate cloud fractions simulated by the SMCM. These results are also not impaired by the inconsistent treatment of congestus cloud area fractions between the two regions (i.e., excluding such clouds for Darwin observations and necessarily including them for Kwajalein observations).

When we stratify the observational data using *C*_{C} as an indicator for convective propensity (cf. Fig. 3), we obtain maximum area fractions for both cloud types for some of the smallest values of *C*_{C} and *D*_{RH}, indicating relatively high convective activity for small values of CAPE and a moist middle troposphere. Most observations fall into a range spanning the lower half of both parameter ranges, also resulting in the lowest cloud area fraction variability (i.e., relative standard deviation) in that range. Similar results are presented in McBride and Frank (1999), who found an inverse relationship between CAPE and precipitation when analyzing data obtained during active and break monsoon periods for a location in the Gulf of Carpentaria.

When stratifying the observations according to either one of the other two choices for *C* (cf. Figs. 4 and 5), we obtain a completely different functional dependency of convective and stratiform cloud fractions on *C* and *D*. Using *C*_{rC} and *C*_{ω} as choices for *C* leads to

- maximum values for both cloud area fractions for the highest values of *C*,
- high and low cloud area fraction variability for low and high values of *C*, respectively,
- a sharp increase in cloud area fractions above a certain value of *C* and low values of *D*, and
- most observations for low values of *C* spanning a wide range of *D*_{RH} values.

For weak forcing of convection (i.e., low values of *C*), average cloud area fractions are small but exhibit large variability, indicating a somewhat stochastic behavior. This is particularly interesting because a large part of the observations yields weak forcing, which would normally act to reduce sample variability. The stronger the forcing of convective activity gets, the fewer observations are registered per bin, suggestive of an expected increase in sample variability. However, cloud area fraction variability is lowest for strong forcing of convection, suggesting a more and more deterministic behavior of convection with increasing forcing, in line with other results derived from the same dataset (Davies et al. 2013). Physically, this implies that when forcing is weak, convection occurs more randomly in the domain, inducing large-scale convergence itself, which may lead to more large-scale organization and greater area fraction of convection. Examples for this behavior could be forcing of convection by local heterogeneities in land surface properties (and thus surface fluxes) or effects of land–sea breezes. This could in fact lead to a positive feedback loop on various scales. However, assessing the effect that this kind of feedback behavior would have on the vertical ascent on the scales considered here (i.e., 190 × 190 km^{2}) is beyond the scope of our study.

Our results, however, do not support the idea that the stochastic component of unresolved subgrid-scale processes scales linearly with their mean response as put forward in earlier studies (e.g., Buizza et al. 1999; Shutts and Palmer 2007). The sharp increase in cloud area fraction above a certain value of *C*, accompanied by low values of *D*, is consistent with the “threshold behavior” of convection as laid out in, for instance, Peters and Neelin (2006). Furthermore, the histograms we show in Figs. 4 and 5 indicate that at least for these two choices of *C*, deep convective as well as stratiform area fractions are anticorrelated with dryness at midlevels, broadly consistent with earlier findings from observational studies (Redelsperger et al. 2002; Derbyshire et al. 2004; Takemi et al. 2004; Takayabu et al. 2010).

We also note that regimes exhibiting both a strong forcing of convection and a dry middle troposphere basically do not exist at the locations considered in this study. This may be obvious, but such a result is not apparent from Fig. 3, where there still exists a fairly large number of measurements yielding a combination of a dry middle troposphere and high values of *C*_{C}.

Histograms similar to those shown in Figs. 3–5 could in principle also be derived from GCM output. Because mass flux–based convective parameterization closures currently employed in state-of-the-art GCMs do not normally predict a parameter that directly corresponds to the deep convective area fraction that we use here, one could, for example, analyze the functional dependence of the total mass flux itself on large-scale parameters. Such an analysis, however, is beyond the scope of this study and will be left for future work.

## 4. Reproducing observed convective behavior using the SMCM

### a. Adjusting the model parameters

The equilibrium cloud fractions of the multistate Markov chain used in the SMCM are calculated by analytically determining its stationary equilibrium distribution (cf. KBM10 for details). The purpose of this section is thus not to evolve the SMCM in time but to arrive at analytically exact equilibrium cloud fractions that allow for a tuning of the model to the observations presented in section 3. By the ergodic theorem of Markov chains, the equilibrium measure is unique and coincides with long-time-average area fractions of the three cloud types for each given set of model parameters (i.e., the transition time scales and the large-scale forcing). We show results from evolving tuned versions of the SMCM in time in section 4b. Here, the analytically derived equilibrium distribution is represented by area fractions for each of the four allowed states of the Markov chain (i.e., clear sky, congestus, deep convection, or stratiform clouds). The sum of all four area fractions for each pair of discrete *C* and *D* values is 1 and the distribution of area fractions among the four states can be adjusted by manipulating the transition time scales associated with the transition from one state to another.
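The calculation described above can be illustrated with a generic four-state continuous-time Markov chain. This is a sketch under stated assumptions, not the KBM10 derivation itself. The function name `stationary_distribution`, the set of seven allowed transitions, and the time scales in `tau_hours` are hypothetical placeholders (they are not the values of Table 1), and the dependence of the rates on *C* and *D* is ignored.

```python
import numpy as np

def stationary_distribution(rates):
    """Stationary distribution of a continuous-time Markov chain.

    rates[i, j] is the transition rate (1/h) from state i to state j (i != j).
    """
    q = np.array(rates, dtype=float)
    np.fill_diagonal(q, 0.0)
    np.fill_diagonal(q, -q.sum(axis=1))  # generator matrix: rows sum to zero
    n = q.shape[0]
    # Solve pi @ Q = 0 together with the normalization sum(pi) = 1
    a = np.vstack([q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(a, b, rcond=None)
    return pi

# States: 0 clear sky, 1 congestus, 2 deep convection, 3 stratiform.
# Hypothetical time scales in hours, for illustration only.
tau_hours = {(0, 1): 1.0, (0, 2): 2.0, (1, 0): 1.0, (1, 2): 1.0,
             (2, 0): 3.0, (2, 3): 3.0, (3, 0): 5.0}
rates = np.zeros((4, 4))
for (i, j), tau in tau_hours.items():
    rates[i, j] = 1.0 / tau

area_fractions = stationary_distribution(rates)  # the four fractions sum to 1
```

Shortening or lengthening a single entry of `tau_hours` shifts area fraction between the four states, which is the handle used for tuning in this section.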

In previous publications, the transition time scales used in the SMCM either were chosen in an ad hoc but physically meaningful manner (KBM10) or were picked to improve the intermittency of the simulated convection in idealized experiments (FMK12). Here we use observations to gauge the applicability of the chosen time scales to represent observed convective behavior. For reference purposes, we show the joint histograms of the analytically derived equilibrium deep convective area fractions for the transition time scales introduced in KBM10 and FMK12 (cf. Table 1) in Fig. 6. These joint histograms clearly indicate that the previously used transition time scales are not suited for reproducing the statistics of observed convection laid out in section 3 for several reasons. First, the transition time scales used in case 1 of KBM10 and in FMK12 yield equilibrium deep convective area fractions about an order of magnitude larger than those observed. Second, the transition time scales used in case 2 of KBM10 result in a deep convective area distribution unsuitable for reproducing observed behavior.

Transition time scales (h) as used in the SMCM. The three leftmost columns contain the transition time scales introduced in previous studies (KBM10, FMK12), yielding the equilibrium deep convective area fraction distributions in Fig. 6. The three rightmost columns contain the visually derived “best fitting” transition time scales for each of the three convection proxies leading to the modeled equilibrium cloud fractions in Fig. 7 at Darwin.

Analytical equilibrium deep convective area fraction of the SMCM's birth–death process given the two sets of transition time scales introduced in KBM10 and FMK12 (Table 1): (left),(middle) case 1 and 2 time scales of KBM10, respectively, and (right) time scales used in FMK12. For the two cases of KBM10, the transition from deep convective to stratiform area depends on *C*. See text and KBM10 for details regarding the calculation of equilibrium area fractions.

To obtain a model that is most suitable for reproducing the observed convective behavior, we systematically adjust the transition time scales until we arrive at a close visual match between the analytical equilibrium solution of the SMCM and the observed mean deep convective cloud fractions for each convective proxy (*C*_{C}, *C*_{rC}, *C*_{ω}) for Darwin shown in Figs. 3–5 (we only use data for Darwin here to test the robustness of the adjusted transition time scales by applying them to the Kwajalein data in the next section). This close match should ideally agree with the general cloud fraction distribution in *C–D* space in both magnitude and shape. Additionally, the equilibrium area fraction calculated for the mean observed *C* and *D* values (black dots in Figs. 3–5) should also match closely. The second requirement achieves a tuning of the model to the “mean observed climate,” thus yielding an optimal representation of observed tropical convective cloud distribution, given that the cloud-type relationships imposed in the SMCM correspond to those in nature. We find that it proves difficult to adequately satisfy both conditions, leading to a trade-off of getting either the mean climate or the maxima right. In general, we focus on arriving at the correct mean climate cloud fractions as this is of higher relevance regarding a possible future implementation into GCMs. The final “best fit” transition time scales for each convective proxy *C* are listed in Table 1 and a comparison of modeled equilibrium and observed mean deep convective area fractions as *f*(*C*, *D*) is displayed in Fig. 7. Given the number of transition time scales (seven), we acknowledge that our approach toward tuning the SMCM appears simple and ad hoc and that there may be more sophisticated and quantitative methods. However, we are confident about the time scales we arrive at, as these do represent the visually closest match possible and any deviations from the chosen sets lead to substantial deterioration of the resulting equilibrium cloud fraction distributions shown in Figs. 3–5.

(left) Joint histograms of analytically computed equilibrium deep convective area fractions of the SMCM and (right) the relative difference to observed mean deep convective area fractions at Darwin as a function of large-scale variables (top) *C*_{C}, (middle) *C*_{rC}, and (bottom) *C*_{ω} and *D*_{RH}. SMCM cloud fractions for each version of *C* correspond to the transition time scales shown in Table 1. Only histogram boxes having more than five observations are shown. The markers denote the mean observed values of *C*_{C}, *C*_{rC}, and *C*_{ω} and *D*_{RH} at Darwin, respectively.

As expected from the observed mean cloud fractions as *f*(*C*, *D*), we find that matching the SMCM equilibrium cloud fractions to the observed cloud fractions stratified by *C*_{C} results in starkly different time scales compared to the other two convection proxies (Table 1). However, all three sets of best-fit transition time scales preserve an important constraint laid out in KBM10, namely that cloud decay acts on identical or longer time scales than cloud formation. It must be kept in mind, however, that these best-fit time scales were found by visually matching the joint histograms of modeled and observed area fractions.

The joint histograms displayed in Fig. 7 indicate that each of the three analytical equilibrium deep convective area distributions corresponding to the best-fit transition time scales in Table 1 has some difficulty in reproducing certain aspects of the corresponding observations at Darwin. For every version of *C*, the model overestimates deep convective area fraction for almost the entire range of considered combinations of *C* and *D*.

This overestimation is highest when using *C*_{rC} to stratify the observations; however, the overall functional relationship is captured (cf. Fig. 4). Using observations stratified by *C*_{C} to adjust the transition time scales yields higher modeled area fractions at nearly every considered *C*–*D* pair, with the degree of overestimation showing no functional dependence on *C* and *D*. Using *C*_{ω}, the SMCM's equilibrium distribution resembles the functional dependency of the observations well. Furthermore, the relative difference of modeled versus observed area fractions shows an evident dependency on *C* and *D*. The model over- and underestimates deep convective area fractions for low and high values of *C*, respectively. This transition from over- to underestimating the area fractions appears systematic and gradual, a promising result in terms of possible future model adjustments (see below). The modeled joint histograms in Fig. 7, however, do not show the capability of the SMCM concept to reproduce observed temporally resolved tropical convection; they are merely analytical equilibrium solutions of the SMCM's internal birth–death process given a particular set of transition time scales (cf. KBM10 and references therein for details).

The over- and underestimation of deep convective area fractions for low and high values of *C*_{ω} (and *C*_{rC}), respectively, is not a matter of finding the correct transition time scales or of ill-formulated “transition rules” but is due to the functional dependency of transition rates on *C* and *D*. KBM10 formulate this dependency through the function Γ(*x*) [Eq. (4)], with *x* being either *C* or *D* and Eq. (4) being directly linked to the transition rates *R* (for example, the rate *R* from cloud state *a* to *b*). This formulation leads to pronounced changes in transition rates for small values of *C* or *D*, with the response becoming less strong with increasing values of *C* and *D*. Therefore, the SMCM in its original formulation is not designed to reproduce the sharp increase in observed cloud fractions shown in Figs. 4 and 5 for higher values of *C*. Alternative formulations of Γ(*x*) could be sought to improve the SMCM's capability to reproduce observed cloud area fraction distributions. This will be investigated in future research.
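The saturating response described above can be made concrete with a short sketch. The functional form used here, Γ(*x*) = 1 − e^{−*x*} for *x* > 0 and 0 otherwise, is an assumption recalled from the KBM10 formulation and is not reproduced in this section. The function name `gamma` and the sample points are illustrative.

```python
import numpy as np

def gamma(x):
    """Assumed saturating activation: 1 - exp(-x) for x > 0, else 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0.0, 1.0 - np.exp(-x), 0.0)

# Equal steps in the forcing x produce ever smaller changes in gamma(x)
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
increments = np.diff(gamma(x))
```

Because each successive increment is smaller than the previous one, a rate proportional to such a function reacts most strongly at weak forcing. This is the opposite of the sharp increase at high *C* seen in the observations.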

### b. Applying the SMCM to observations

In this section, we use the three sets of observation-derived parameters discussed in sections 2c and 3 in combination with the best-fit transition time scales shown in Table 1 to perform simulations with the SMCM. We first qualitatively discuss the temporally resolved reproduction of cloud area fractions compared to observations in section 4b(1) and then carry out a more thorough statistical analysis in section 4b(2).

#### 1) SMCM temporally resolved tropical convection

We use the subsets of the data from the Darwin and Kwajalein locations introduced in section 4b to compare the time series of observed cloud area fractions to those modeled by the SMCM for illustrative purposes. As we obtained the best-fit transition time scales shown in Table 1 from analyzing just Darwin data, application of these time scales to Kwajalein provides a strong test for our method. We force the SMCM with each of the three combinations of *C*_{C}, *C*_{rC}, and *C*_{ω} with *D*_{RH}. The internal model time step is set to 5 min. The 6-hourly observations were linearly interpolated to match the model time step. The subgrid-scale lattice of the SMCM is set up to have 20 × 20 sites. As the whole domain covers an area of about 190 × 190 km^{2}, each lattice site thus has an edge length of about 10 km. There is currently no fixed spatial scale for an individual lattice point considered in the formulation of the SMCM. Preliminary analysis shows that an increase in the number of lattice sites, and the reduction of lattice-site size going with it, reduces the simulated temporal variability compared to observations but has no effect on correlations. This is mainly an effect of the SMCM-modeled cloud fractions approaching the deterministic limit of the Markov process (cf. KBM10).
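The interpolation step and the lattice geometry can be sketched as follows. This is an illustration assuming NumPy. The function name `interpolate_forcing` and the three sample forcing values are hypothetical, and only the 6-hourly spacing, the 5-min time step, and the 20 × 20 lattice over roughly 190 km come from the text.

```python
import numpy as np

OBS_STEP_MIN = 360   # 6-hourly observations
MODEL_STEP_MIN = 5   # SMCM internal time step

def interpolate_forcing(values):
    """Linearly interpolate 6-hourly forcing values onto the 5-min model grid."""
    values = np.asarray(values, dtype=float)
    t_obs = np.arange(len(values)) * OBS_STEP_MIN
    t_model = np.arange(0, t_obs[-1] + 1, MODEL_STEP_MIN)
    return t_model, np.interp(t_model, t_obs, values)

# Three made-up forcing values spanning 12 hours
t_model, c_model = interpolate_forcing([0.0, 1.2, 0.6])

# Edge length of one lattice site for a 20 x 20 lattice over about 190 km
edge_km = 190.0 / 20
```

Each 6-h interval is thus covered by 72 model steps, and the edge length evaluates to 9.5 km, consistent with the "about 10 km" quoted above.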

From a GCM convection parameterization perspective it thus seems attractive to use SMCM lattice sites with a globally uniform fixed spatial scale (e.g., 1 × 1 km^{2}), leading to increased convective variability with increasing GCM resolution (i.e., reduced GCM gridbox size). This would yield a more realistic representation of convection compared to current deterministic schemes.
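The link between the number of lattice sites and the simulated variability can be illustrated with a deliberately simplified sketch. It treats every site as independently occupied by deep convection with a fixed probability `p`, which ignores the temporal correlation and the forcing dependence of the actual SMCM. The function name `area_fraction_std` and all numbers are hypothetical.

```python
import numpy as np

def area_fraction_std(n_sites, p, n_samples=20000, seed=0):
    """Standard deviation of the area fraction when each of n_sites
    is independently occupied with probability p (simplifying assumption)."""
    rng = np.random.default_rng(seed)
    frac = rng.binomial(n_sites, p, size=n_samples) / n_sites
    return frac.std()

p = 0.05
# Fewer sites per grid box (finer GCM grid at fixed site size) -> larger spread
spread = {n: area_fraction_std(n, p) for n in (100, 400, 1600)}
```

Under this assumption the spread scales as sqrt(p(1 − p)/N), so quadrupling the number of sites halves the variability, and the deterministic limit is approached as N grows.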

The resulting modeled time series of deep convective cloud area fractions for Darwin and Kwajalein are shown in Figs. 8 and 9, with the observed time series included for reference purposes. We show neither congestus nor stratiform cloud fractions (observed or modeled) because our main interest lies in assessing the representation of deep convection, as this is our current target for GCM convection parameterizations.

Observed and SMCM-modeled time series of deep convective area fraction over Darwin during the time period 10 Nov 2005–18 Apr 2006. SMCM time series are obtained by forcing the SMCM with the observed *C* and *D* parameters introduced in section 2c and the transition time scales shown in Table 1. Results indicate one possible solution of the stochastic modeling approach.

As in Fig. 8, but over Kwajalein during the time period 2 May 2008–31 Jan 2009.

We first consider the observed and modeled deep convective area fractions over Darwin shown in Fig. 8 as we have adjusted the model parameters of the SMCM specifically for this location. Forcing the SMCM with *C*_{C} results in more or less constant convective cloud area fractions showing no resemblance to the different regimes found in the observations. Because of the nonnegative and mostly nonzero values of the *C*_{C} time series (cf. Fig. 2), the SMCM cannot reproduce the intermittency of cloud area fractions found in the observations. The same issue is apparent when forcing the SMCM with *C*_{rC}. However, periods of higher modeled deep convective cloud fraction seem to loosely correspond to periods of higher observed fractions, giving slightly more confidence in using *C*_{rC} over *C*_{C}.

The results from using *C*_{ω} to force the SMCM show substantially more agreement with the observations, with *C*_{ω} leading to more variability during periods of low convective activity, especially during the first month or so of the considered time period. Analysis of correlations between modeled and observed deep convective area fraction time series and probability distribution functions thereof confirms that using *C*_{ω} as proxy for convective propensity results in a better representation of convective behavior compared to the other two proxies (not shown). Despite these encouraging results, the issues raised toward the end of section 4a are apparent. For periods of weak forcing, the SMCM produces too high a deep convective cloud fraction whereas cloud fractions during strongly forced periods are substantially underestimated compared to observations. This is exactly what is to be expected from the modeled equilibrium cloud fractions shown in Fig. 7.

The observed and modeled time series of deep convective area fraction for the Kwajalein area (Fig. 9) generally show the same behavior as the ones for the Darwin area (Fig. 8). In particular, the over- and underestimation of deep convective area fractions for small and large values of *C*_{ω}, respectively, is evident. Nevertheless, *C*_{ω} proves to be the parameter of choice for reproducing deep convective features over Kwajalein with the SMCM. Considering that we did not use the Kwajalein data to adjust the transition time scales in the SMCM in the preceding part of this paper, this result confirms the findings presented in section 3, namely that convection over Kwajalein shows similar functional dependencies to the large-scale environment as does convection over Darwin. Furthermore, this result indicates that at least in the framework of the SMCM, tropical convection acts on similar time scales for both tropical locations considered here. It is, however, important to keep in mind the possible ambiguities when attempting to establish cause-and-effect relationships between the large-scale state and convection when using *C*_{ω} (cf. section 3).

#### 2) Statistics of SMCM-modeled versus observed tropical convection

We now analyze the SMCM-modeled tropical convection to quantify the capability of the SMCM framework to reproduce the observed statistical properties of deep convective and stratiform area fractions laid out in section 3 as well as the actual stochasticity of the modeled convection. For the sake of brevity, we limit this analysis to experiments in which convection in the SMCM is determined by *C*_{ω}. We choose to do so because the SMCM versions using the two other parameters *C*_{C} and *C*_{rC} were shown unsuitable for reproducing the basic temporal behavior of convection [cf. section 4b(1)].

Similar to the analysis of observed convection presented in section 3, we stratify the modeled time series of deep convective and stratiform area fractions by the values of *C*_{ω} and *D*_{RH} used for forcing the model. To ensure comparability with the observations, we average the modeled area fractions over 6-h periods centered over each time step of the observed large-scale atmospheric state. Similar to the histograms shown in Figs. 3–5, we show the results obtained for Darwin and Kwajalein separately in Fig. 10, again providing a test for the validity of the chosen transition time scales for both locations.
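The centered 6-h averaging can be sketched as below, assuming the 5-min model time step of section 4b(1), so that a window of ±3 h corresponds to ±36 model steps. The function name `centered_window_mean` and the truncation of windows at the ends of the series are illustrative choices, not taken from the study.

```python
import numpy as np

def centered_window_mean(series, obs_idx, half_width):
    """Average `series` over windows centered on each index in obs_idx.

    Windows are truncated at the start and end of the series.
    """
    series = np.asarray(series, dtype=float)
    out = np.empty(len(obs_idx))
    for k, i in enumerate(obs_idx):
        lo = max(i - half_width, 0)
        hi = min(i + half_width + 1, len(series))
        out[k] = series[lo:hi].mean()
    return out

STEPS_PER_OBS = 72   # 6 h at a 5-min time step
HALF_WIDTH = 36      # +/- 3 h around each observation time

# Toy modeled series covering 12 h, with observations at 0, 6, and 12 h
modeled = np.arange(145, dtype=float)
obs_idx = np.arange(0, len(modeled), STEPS_PER_OBS)
averaged = centered_window_mean(modeled, obs_idx, HALF_WIDTH)
```

The averaged values can then be binned by the observed forcing in the same way as the observed area fractions.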

Joint histogram of modeled cloud area fractions and relative standard deviations as function of large-scale variables *C*_{ω} and *D*_{RH} at the (left) Darwin and (right) Kwajalein sites derived from sampling the modeled cloud area fraction time series using all the available forcing data from observations (cf. section 4b) and the transition time scales from Table 1. Only pixels having more than five observations are shown. (top) Deep convective clouds and (bottom) stratiform clouds. Sample sizes per bin are as in Fig. 5. The black markers denote the mean values of *C*_{ω} and *D*_{RH}.

As expected, the joint histogram of SMCM deep convective area fractions obtained from the modeled time series of the Darwin location very much resembles that of the analytically derived equilibrium area fraction for the same set of transition time scales (Fig. 7, bottom). These statistics of the modeled time series more clearly reveal the shortcomings of the SMCM framework in reproducing observed convection already mentioned in sections 4a and 4b(1). The order of magnitude of deep convective area fraction is generally well captured, with the SMCM over- and underestimating area fractions for weak and strong convective forcing, respectively. The same also holds for the simulated stratiform cloud fractions for the Darwin area, which we show here for illustrative purposes, mainly to highlight that the transition time scales that we determined in section 4a also yield sensible values for that cloud type. More importantly, the sample standard deviations of deep convective and stratiform area fractions of the modeled time series show similar behavior compared to those of the observations (i.e., area fractions show higher and lower variability for weaker and stronger convective forcing, respectively). Sensitivity tests in which we force “untuned” versions of the SMCM with the observed large-scale atmospheric state show that this behavior is in fact an intrinsic property of the SMCM framework and does not depend on the chosen set of transition time scales. The modeled time series underestimate the degree of variability throughout, however (note the different color scales in Fig. 10 compared to Fig. 5). So for the Darwin area, the SMCM framework is suitable for reproducing observed behavior of tropical convection, both in terms of deep convective and stratiform cloud area fractions and variability, as a function of the observed large-scale environment.

For the Kwajalein area, the joint histograms in Fig. 10 lead us to similar conclusions, thereby supporting the applicability of the SMCM framework to both tropical locations considered here. However, because of the sparse sampling of strong convective forcing over Kwajalein, the overestimation of cloud area fractions for weak convective forcing dominates the statistics. As mentioned in section 4a, the sometimes substantial overestimation of cloud area fractions could be mitigated by using alternative formulations of Eq. (4), which will be a topic of future research.

## 5. Summary and conclusions

This study was driven by the need for alternatives to the mostly deterministic convection parameterizations used in general circulation models (GCMs). For this, we first determined statistics of observed tropical convection over Darwin and Kwajalein stratified by environmental conditions. Then, we used these observed statistics to investigate whether the underlying framework of the stochastic multicloud model (SMCM; KBM10) is suitable for reproducing observed tropical convection, a prerequisite to using the underlying stochastic framework of the SMCM in a GCM convection parameterization.

We investigated the dependency of tropical convection, given by the fractional area coverage with deep convective or stratiform clouds, on a set of two proxy values obtained from the observed large-scale atmospheric state (derived by means of variational analysis; Davies et al. 2013). One proxy (*C*) represents the ability of the atmospheric column to initiate/sustain convection whereas the second proxy (*D*) represents midtropospheric dryness. As there exists no generally accepted theory of which environmental conditions actually lead to tropical convection, we used three different formulations for *C*: CAPE, the ratio of low-level CAPE (LCAPE; i.e., CAPE integrated up to the freezing level) to CAPE, and vertical velocity at 500 hPa. The value of *D* is obtained from relative humidity at 500 hPa.

We found that the relationship of observed cloud area fractions with CAPE is very different compared to the other two *C* proxies. We found the highest deep convective and stratiform cloud area fractions for low values of CAPE, supporting earlier findings that CAPE is approximately anticorrelated or only weakly correlated with tropical precipitation (Mapes and Houze 1992; McBride and Frank 1999; Sobel et al. 2004; Fletcher and Bretherton 2010). Here, the studies of Mapes and Houze (1992) and Sobel et al. (2004) are of particular interest as they base their findings on observations gathered at Darwin and Kwajalein, respectively. On the other hand, deep convective and stratiform cloud area fractions are positively correlated with the other two *C* proxies. The cloud area fraction distributions as a function of *C* and *D* also revealed that, for those two *C* proxies,

- high and low cloud area fraction variability occurs for low and high values of *C*, respectively, implying that convection appears more random under weakly forced conditions and gets more and more deterministic with increasing forcing (cf. also Davies et al. 2013), thus contradicting the idea that the stochastic component of unresolved subgrid-scale processes scales linearly with their mean response (e.g., Buizza et al. 1999; Shutts and Palmer 2007);
- cloud area fractions increase sharply above a certain value of *C* given low values of *D*, consistent with earlier reports on critical behavior of tropical convection (e.g., Peters and Neelin 2006);
- cloud area fractions show identical relationships to environmental conditions for both locations (Darwin and Kwajalein), albeit with starkly different boundary conditions (e.g., land–sea distribution, monsoonal forcing); and
- deep convective and stratiform cloud area fractions are anticorrelated with midtropospheric dryness [consistent with Redelsperger et al. (2002), Derbyshire et al. (2004), Takemi et al. (2004), and Takayabu et al. (2010)].

*f*(*C*, *D*), especially for the data stratified by CAPE. Generally, the SMCM yields too high and too low a cloud fraction for weak and strong large-scale forcing, respectively. We found that the values of the tuning parameters leading to a sensible match with the observed convection also respect the general rules for cloud transition probabilities laid out in KBM10, an overall very encouraging result.

Using the parameter-adjusted SMCM, we simulated convective area fractions using the time series of the observed large-scale state. We thus applied the SMCM in a diagnostic fashion and found that the modeled area fractions of deep convective and stratiform clouds compare better to observations when using the convection proxy related to convergence (i.e., vertical velocity at 500 hPa) rather than those related to stability (i.e., total CAPE and the ratio of low-level to total CAPE). This is most probably related to the nonintermittent and positive-definite nature of the latter proxies, which does not allow for simulation of the intermittent cloud features found in the observations.
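To illustrate what applying a stochastic lattice model "in a diagnostic fashion" means, the sketch below drives a deliberately simplified two-state lattice (clear or deep convective) with a prescribed forcing time series. It is not the SMCM itself, which distinguishes more cloud types. The activation function `gamma`, the rate expressions, the time scales `tau_on` and `tau_off`, and the lattice size are all hypothetical choices made only for this example.

```python
import numpy as np

def gamma(x):
    """Hypothetical activation function: 0 for x <= 0, saturating toward 1."""
    return 1.0 - np.exp(-np.maximum(x, 0.0))

def diagnose_area_fraction(c_series, d_series, n_sites=400, tau_on=2.0,
                           tau_off=5.0, dt_forcing=6.0, n_sub=72, seed=0):
    """Deep convective area fraction at the end of each forcing interval.

    Times are in hours. tau_on and tau_off are made-up transition time
    scales, not the tuned values of the study.
    """
    rng = np.random.default_rng(seed)
    deep = np.zeros(n_sites, dtype=bool)  # start from a clear-sky lattice
    dt = dt_forcing / n_sub
    fractions = []
    for c, d in zip(c_series, d_series):
        # Diagnostic use: the prescribed forcing sets the rates, but the
        # lattice state never feeds back on the forcing.
        rate_on = gamma(c) * (1.0 - gamma(d)) / tau_on   # clear -> deep
        rate_off = (1.0 - gamma(c)) / tau_off            # deep -> clear
        p_on = 1.0 - np.exp(-rate_on * dt)
        p_off = 1.0 - np.exp(-rate_off * dt)
        for _ in range(n_sub):
            u = rng.random(n_sites)
            # Deep sites survive with probability 1 - p_off;
            # clear sites switch on with probability p_on.
            deep = np.where(deep, u >= p_off, u < p_on)
        fractions.append(deep.mean())
    return np.array(fractions)

# Usage with a synthetic forcing series of increasing strength.
fractions = diagnose_area_fraction(c_series=[0.0, 0.5, 1.0, 2.0],
                                   d_series=[0.5, 0.5, 0.2, 0.1])
```

In a prognostic setting the modeled convection would instead modify the large-scale state between time steps, which is the step this sketch leaves out.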

When using the convergence-based convection proxy to force the SMCM to generate time series of tropical convection, we found that the framework of the SMCM is capable of reproducing both the overall functional relationships and the statistics of observed tropical convection. In particular, the SMCM tropical convection also shows higher variability in weakly forced conditions than in more strongly forced conditions. The degree of variability is underestimated compared to observations, however. We conjecture that the variability of the modeled convection would be higher if the SMCM were used in a prognostic framework rather than the diagnostic framework in which we applied it in this study. Furthermore, the 6-hourly time step of the observed large-scale state that we employ here may smear out part of the convective-scale variability, thus possibly constraining the stochastic process employed in the SMCM too strongly and also limiting the applicability to analysis of the diurnal cycle. Future work should thus also concentrate on deriving large-scale, long-term datasets of improved temporal resolution.

We acknowledge that there do exist ambiguities in establishing sound cause-and-effect relationships when attempting to relate tropical convection to large-scale convergence. Here, we argue for an integrated view of tropical convection over the entire range of atmospheric forcing strength of large-scale convergence. For weak (or even negative) convergence, convective area fractions are very small and show substantial variability about the mean (i.e., induced by very localized forcing such as land surface heterogeneities or land–sea breezes). However, these small-scale features induce some low-level convergence themselves, which may lead to more large-scale organization and greater area fraction of convection. This could in fact lead to a positive feedback loop on various scales. However, assessing the effect that this kind of feedback behavior would have on the vertical ascent on the scales considered here (i.e., 190 × 190 km²) is beyond the scope of our study. In conclusion, we acknowledge the complex interactive relationship between convection and area-averaged vertical motion while noting that both directions in the interaction are likely at play and of importance with different weights based on the large-scale synoptic situation.

This study has shown that the stochastic concept behind the SMCM has the potential to underpin novel convection parameterizations in GCMs. As mass-flux convection parameterizations need to predict the vertical mass flux at cloud base, the concept of the SMCM would yield a stochastically based area and the updraft velocity could be given by another adequate formulation [e.g., that introduced in Jakob and Siebesma (2003)]. Furthermore, a reduction of GCM gridbox size (i.e., increasing resolution) would then lead to increased convective variability, which would make such a scheme superior to currently used deterministic convection schemes. Ultimately, future efforts will converge toward implementing a prototype version of a parameterization incorporating the SMCM framework into a full GCM.
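The resolution argument above can be made concrete with a small numerical sketch. It assumes, purely for illustration, that lattice sites have a fixed physical size and switch on independently with the same probability `p_deep`, so that a smaller grid box simply contains fewer sites. Under that assumption the area fraction follows a scaled binomial distribution with standard deviation sqrt(p(1 - p)/N) for N sites. The function name and all numbers are hypothetical.

```python
import numpy as np

def area_fraction_std(p_deep, n_sites, n_samples=20000, seed=0):
    """Sample standard deviation of the deep convective area fraction.

    Assumes independent lattice sites, each deep with probability p_deep.
    """
    rng = np.random.default_rng(seed)
    occupied = rng.binomial(n_sites, p_deep, size=n_samples)
    return (occupied / n_sites).std(ddof=1)

p_deep = 0.1  # illustrative occupation probability, not a fitted value
for n_sites in (1600, 400, 100):  # fewer sites mimic a smaller grid box
    simulated = area_fraction_std(p_deep, n_sites)
    expected = np.sqrt(p_deep * (1.0 - p_deep) / n_sites)
    print(f"N = {n_sites:4d}: simulated std = {simulated:.4f}, "
          f"binomial std = {expected:.4f}")
```

Halving the grid spacing reduces the number of sites by a factor of four and, in this idealized setting, doubles the standard deviation of the area fraction, whereas a deterministic scheme would return the same value regardless of grid box size.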

## Acknowledgments

The authors thank three anonymous reviewers and the editor for their constructive comments, which helped improve the paper. The research of L.D. is supported by the Office of Science (BER), U.S. Department of Energy under Grant DE-FG02-09ER64742. The research of B.K. is supported by the Natural Sciences and Engineering Research Council of Canada. The research of A.J.M. is partially supported by grants from the National Science Foundation and the Office of Naval Research.

## REFERENCES

Arakawa, A., 2004: The cumulus parameterization problem: Past, present, and future. *J. Climate*, **17**, 2493–2525.

Bright, D., and S. Mullen, 2002: Short-range ensemble forecasts of precipitation during the southwest monsoon. *Wea. Forecasting*, **17**, 1080–1100.

Buizza, R., M. Miller, and T. Palmer, 1999: Stochastic representation of model uncertainties in the ECMWF ensemble prediction system. *Quart. J. Roy. Meteor. Soc.*, **125**, 2887–2908.

Charney, J., and Coauthors, 1979: *Carbon Dioxide and Climate: A Scientific Assessment*. The National Academies Press, 22 pp.

Crook, N., 2001: Understanding Hector: The dynamics of island thunderstorms. *Mon. Wea. Rev.*, **129**, 1550–1563.

Dai, A., 2006: Precipitation characteristics in eighteen coupled climate models. *J. Climate*, **19**, 4605–4630.

Davies, L., C. Jakob, P. May, V. V. Kumar, and S. Xie, 2013: Relationships between the large-scale atmosphere and the small-scale convective state for Darwin, Australia. *J. Geophys. Res.*, doi:10.1002/jgrd.50645, in press.

Derbyshire, S., I. Beau, P. Bechtold, J. Grandpeix, J. Piriou, J. Redelsperger, and P. Soares, 2004: Sensitivity of moist convection to environmental humidity. *Quart. J. Roy. Meteor. Soc.*, **130**, 3055–3079.

Dorrestijn, J., D. Crommelin, A. Siebesma, and H. Jonker, 2013: Stochastic parameterization of shallow cumulus convection estimated from high-resolution model data. *Theor. Comp. Fluid Dyn.*, **27**, 133–148, doi:10.1007/s00162-012-0281-y.

Fletcher, J. K., and C. S. Bretherton, 2010: Evaluating boundary layer–based mass flux closures using cloud-resolving model simulations of deep convection. *J. Atmos. Sci.*, **67**, 2212–2225.

Frenkel, Y., A. Majda, and B. Khouider, 2012: Using the stochastic multicloud model to improve tropical convective parameterization: A paradigm example. *J. Atmos. Sci.*, **69**, 1080–1105.

Frenkel, Y., A. Majda, and B. Khouider, 2013: Stochastic and deterministic multicloud parameterizations for tropical convection. *Climate Dyn.*, in press.

Gillespie, D., 1975: An exact method for numerically simulating the stochastic coalescence process in a cloud. *J. Atmos. Sci.*, **32**, 1977–1989.

Groenemeijer, P., and G. C. Craig, 2012: Ensemble forecasting with a stochastic convective parametrization based on equilibrium statistics. *Atmos. Chem. Phys.*, **12**, 4555–4565, doi:10.5194/acp-12-4555-2012.

Hohenegger, C., and B. Stevens, 2013: Preconditioning deep convection with cumulus congestus. *J. Atmos. Sci.*, **70**, 448–464.

Houze, R., 1997: Stratiform precipitation in regions of convection: A meteorological paradox? *Bull. Amer. Meteor. Soc.*, **78**, 2179–2196.

Jakob, C., and A. Siebesma, 2003: A new subcloud model for mass-flux convection schemes: Influence on triggering, updraft properties, and model climate. *Mon. Wea. Rev.*, **131**, 2765–2778.

Keenan, T. D., B. R. Morton, X. S. Zhang, and K. Nyguen, 1990: Some characteristics of thunderstorms over Bathurst and Melville Islands near Darwin, Australia. *Quart. J. Roy. Meteor. Soc.*, **116**, 1153–1172, doi:10.1002/qj.49711649508.

Keenan, T. D., K. Glasson, F. Cummings, T. Bird, J. Keeler, and J. Lutz, 1998: The BMRC/NCAR C-band polarimetric (C-Pol) radar system. *J. Atmos. Oceanic Technol.*, **15**, 871–886.

Khouider, B., 2013: A coarse grained stochastic multi-type particle interacting model for tropical convection: Nearest neighbour interactions. *Commun. Math. Sci.*, in press.

Khouider, B., and A. Majda, 2006: A simple multicloud parameterization for convectively coupled tropical waves. Part I: Linear analysis. *J. Atmos. Sci.*, **63**, 1308–1323.

Khouider, B., and A. Majda, 2008a: Equatorial convectively coupled waves in a simple multicloud model. *J. Atmos. Sci.*, **65**, 3376–3397.

Khouider, B., and A. Majda, 2008b: Multicloud models for organized tropical convection: Enhanced congestus heating. *J. Atmos. Sci.*, **65**, 895–914.

Khouider, B., A. Majda, and M. Katsoulakis, 2003: Coarse-grained stochastic models for tropical convection and climate. *Proc. Natl. Acad. Sci. USA*, **100**, 11 941–11 946.

Khouider, B., J. Biello, and A. Majda, 2010: A stochastic multicloud model for tropical convection. *Commun. Math. Sci.*, **8**, 187–216.

Klocke, D., R. Pincus, and J. Quaas, 2011: On constraining estimates of climate sensitivity with present-day observations through model weighting. *J. Climate*, **24**, 6092–6099.

Kumar, V. V., C. Jakob, A. Protat, P. T. May, and L. Davies, 2013a: The four cumulus cloud modes and their progression during rainfall events: A C-band polarimetric radar perspective. *J. Geophys. Res.*, **118**, 8375–8389, doi:10.1002/jgrd.50640.

Kumar, V. V., A. Protat, P. May, C. Jakob, G. Penide, S. Kumar, and L. Davies, 2013b: On the effects of large-scale environment and surface types on convective cloud characteristics over Darwin, Australia. *Mon. Wea. Rev.*, **141**, 1358–1374.

Lin, J., and J. Neelin, 2000: Influence of a stochastic moist convective parameterization on tropical climate variability. *Geophys. Res. Lett.*, **27**, 3691–3694.

Lin, J., and J. Neelin, 2003: Toward stochastic deep convective parameterization in general circulation models. *Geophys. Res. Lett.*, **30**, 1162, doi:10.1029/2002GL016203.

Lin, Y., and Coauthors, 2012: TWP-ICE global atmospheric model intercomparison: Convection responsiveness and resolution impact. *J. Geophys. Res.*, **117**, D09111, doi:10.1029/2011JD017018.

Majda, A., and M. Shefter, 2001: Models for stratiform instability and convectively coupled waves. *J. Atmos. Sci.*, **58**, 1567–1584.

Majda, A., and B. Khouider, 2002: Stochastic and mesoscopic models for tropical convection. *Proc. Natl. Acad. Sci. USA*, **99**, 1123–1128.

Majda, A., C. Franzke, and B. Khouider, 2008: An applied mathematics perspective on stochastic modelling for climate. *Philos. Trans. Roy. Soc.*, **366A**, 2427–2453.

Mapes, B., and R. A. Houze, 1992: An integrated view of the 1987 Australian monsoon and its mesoscale convective systems. I: Horizontal structure. *Quart. J. Roy. Meteor. Soc.*, **118**, 927–963.

May, P., J. Mather, G. Vaughan, C. Jakob, G. McFarquhar, K. Bower, and G. Mace, 2008: The Tropical Warm Pool International Cloud Experiment. *Bull. Amer. Meteor. Soc.*, **89**, 629–646.

McBride, J., and W. Frank, 1999: Relationships between stability and monsoon convection. *J. Atmos. Sci.*, **56**, 24–36.

Meehl, G., C. Covey, T. Delworth, M. Latif, B. McAvaney, J. Mitchell, R. Stouffer, and K. Taylor, 2007: The WCRP CMIP3 multimodel dataset: A new era in climate change research. *Bull. Amer. Meteor. Soc.*, **88**, 1383–1394.

Murphy, J., D. Sexton, D. Barnett, G. Jones, M. Webb, M. Collins, and D. Stainforth, 2004: Quantification of modelling uncertainties in a large ensemble of climate change simulations. *Nature*, **430**, 768–772, doi:10.1038/nature02771.

Palmer, T., 2001: A nonlinear dynamical perspective on model error: A proposal for non-local stochastic-dynamic parametrization in weather and climate prediction models. *Quart. J. Roy. Meteor. Soc.*, **127**, 279–304.

Peters, O., and J. Neelin, 2006: Critical phenomena in atmospheric precipitation. *Nat. Phys.*, **2**, 393–396, doi:10.1038/nphys314.

Pincus, R., C. Batstone, R. Hofmann, K. Taylor, and P. Glecker, 2008: Evaluating the present-day simulation of clouds, precipitation, and radiation in climate models. *J. Geophys. Res.*, **113**, D14209, doi:10.1029/2007JD009334.

Plant, R., and G. Craig, 2008: A stochastic parameterization for deep convection based on equilibrium statistics. *J. Atmos. Sci.*, **65**, 87–105.

Randall, D. A., and Coauthors, 2007: Climate models and their evaluation. *Climate Change 2007: The Physical Science Basis*, S. Solomon et al., Eds., Cambridge University Press, 589–662.

Redelsperger, J., D. Parsons, and F. Guichard, 2002: Recovery processes and factors limiting cloud-top height following the arrival of a dry intrusion observed during TOGA COARE. *J. Atmos. Sci.*, **59**, 2438–2457.

Sherwood, S., 1999: Convective precursors and predictability in the tropical western Pacific. *Mon. Wea. Rev.*, **127**, 2977–2991.

Shutts, G., and T. Palmer, 2007: Convective forcing fluctuations in a cloud-resolving model: Relevance to the stochastic parameterization problem. *J. Climate*, **20**, 187–202.

Sobel, A. H., S. E. Yuter, C. S. Bretherton, and G. N. Kiladis, 2004: Large-scale meteorology and deep convection during TRMM KWAJEX. *Mon. Wea. Rev.*, **132**, 422–444.

Steiner, M., R. Houze, and S. Yuter, 1995: Climatological characterization of three-dimensional storm structure from operational radar and rain gauge data. *J. Appl. Meteor.*, **34**, 1978–2007.

Takayabu, Y., S. Shige, W. Tao, and N. Hirota, 2010: Shallow and deep latent heating modes over tropical oceans observed with TRMM PR spectral latent heating data. *J. Climate*, **23**, 2030–2046.

Takemi, T., O. Hirayama, and C. Liu, 2004: Factors responsible for the vertical development of tropical oceanic cumulus convection. *Geophys. Res. Lett.*, **31**, L11109, doi:10.1029/2004GL020225.

Teixeira, J., and C. Reynolds, 2008: Stochastic nature of physical parameterizations in ensemble prediction: A stochastic convection approach. *Mon. Wea. Rev.*, **136**, 483–496.

Tompkins, A., and J. Berner, 2008: A stochastic convective approach to account for model uncertainty due to unresolved humidity variability. *J. Geophys. Res.*, **113**, D18101, doi:10.1029/2007JD009284.

Waliser, D., and M. Moncrieff, 2007: Year of Tropical Convection—A joint WCRP–THORPEX activity to address the challenge of tropical convection. *GEWEX News*, No. 17(2), International GEWEX Project Office, Silver Spring, MD, 8–9.

Zhang, M., and J. Lin, 1997: Constrained variational analysis of sounding data based on column-integrated budgets of mass, heat, moisture, and momentum: Approach and application to ARM measurements. *J. Atmos. Sci.*, **54**, 1503–1524.