## 1. Introduction

Deep convection is an atmospheric process of major importance in Earth’s weather and climate system. Locally, it transports heat, moisture, and momentum vertically in the atmosphere (Arakawa 2004). Globally, it affects the large-scale circulation (Randall et al. 1989). Further, deep convection largely determines precipitation in the tropics. Of specific interest is its coupling to equatorial waves (e.g., equatorial Kelvin waves, Rossby waves, and the MJO) that largely determine the variability of precipitation (Kiladis et al. 2009; Wheeler and Kiladis 1999). Most GCMs do not resolve deep convection. Instead, this process is represented by parameterizations, assuming, for example, a cumulus ensemble that is in quasi equilibrium with the large-scale forcing (Arakawa and Schubert 1974).

Availability of larger computational resources allows GCMs to be run at finer resolutions. At horizontal grid resolutions below

The stochastic subgrid-process parameterization approach used in this paper was introduced by Crommelin and Vanden-Eijnden (2008). The main idea behind this approach is to represent subgrid processes of an atmosphere or ocean model by stochastic processes whose properties are inferred from high-resolution data prior to implementation. More specifically, the processes are represented by finite-state Markov chains with transition probability matrices that are estimated from data and are conditioned on the resolved model variables. In Crommelin and Vanden-Eijnden (2008), the conditional Markov chains (CMCs) were shown to adequately represent subgrid-scale variables in the Lorenz '96 model (Lorenz 1996). Using the same CMC approach in a GCM to parameterize convection is a challenging task.
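The estimation step described above can be illustrated with a minimal sketch. It assumes that both the subgrid state and the resolved variable have already been discretized into integer bins, and that the transition from time t to t + 1 is conditioned on the resolved bin at time t. The function name `estimate_cmc`, the toy sequences, and the treatment of unobserved state pairs are illustrative assumptions, not taken from the cited papers.

```python
def estimate_cmc(states, conditions, n_states, n_conditions):
    """Estimate one row-stochastic transition matrix per conditioning bin.

    states[t]     : discrete subgrid state at time t (0 .. n_states - 1)
    conditions[t] : discrete bin of the resolved variable at time t
    """
    # counts[c][i][j] = number of observed transitions i -> j under condition c
    counts = [[[0] * n_states for _ in range(n_states)]
              for _ in range(n_conditions)]
    for t in range(len(states) - 1):
        counts[conditions[t]][states[t]][states[t + 1]] += 1

    matrices = []
    for c in range(n_conditions):
        matrix = []
        for i in range(n_states):
            total = sum(counts[c][i])
            if total == 0:
                # Unobserved (condition, state) pair: keep the state fixed.
                # This fallback is an assumption made for the sketch.
                row = [1.0 if j == i else 0.0 for j in range(n_states)]
            else:
                row = [counts[c][i][j] / total for j in range(n_states)]
            matrix.append(row)
        matrices.append(matrix)
    return matrices


# Toy data: two subgrid states and two bins of the resolved variable.
toy_states = [0, 0, 1, 1, 0, 1]
toy_conditions = [0, 0, 0, 1, 1, 1]
toy_matrices = estimate_cmc(toy_states, toy_conditions, 2, 2)
```

Each row of every estimated matrix sums to one, so a matrix can be used directly to draw the next state once the current resolved bin is known.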

In a GCM, neither the large-scale nor the subgrid-scale state is a single scalar, as is the case in the Lorenz '96 model; instead, each is formed by various vertical profiles of resolved and subgrid variables, respectively. Another difficulty is the availability of high-resolution data of convection. As explained by Crommelin and Vanden-Eijnden (2008), Markov chains can be inferred from high-resolution convection-resolving model data as well as from observational data. Inferring CMCs from high-resolution model data has been explored by Dorrestijn et al. (2013a,b).

Inspired by the stochastic multicloud model of Khouider et al. (2010), Dorrestijn et al. (2015) constructed a stochastic multicloud model on a two-dimensional square lattice, using CMCs inferred from observational data. The model was inferred from an extensive dataset, consisting of a combination of high-resolution data of deep convection (Kumar et al. 2013) and large-scale reanalysis data improved with observational data (Davies et al. 2013). The high-resolution (2.5 × 2.5 km^{2}) data originated from a rain radar located in the tropics (Darwin, Australia) and were available every 10 min for several months in a region of size ~1.5° × 1.5°. Thresholds for the cloud-top height and the rain rate were used for classification into a finite number of convective or stratiform cloud types (Dorrestijn et al. 2013a; Khouider et al. 2010). Observations of cloud-type transitions were used to estimate the transition probabilities of the CMCs. When conditioned on the large-scale vertical velocity and choosing 100 CMCs, the cloud-type area fractions of the scheme were comparable to the observational fractions in the radar domain. By varying the number of CMCs, the multicloud model could be adapted to the size of a GCM column, thereby making the parameterization scale aware.

In Gottwald et al. (2016), a similar data-driven stochastic scheme has been developed. Observational datasets from Darwin and Kwajalein were used to construct parameterizations of the convective area fraction

Testing the schemes in a dynamical environment, in which the CMCs interact with the resolved model variables in a GCM, is a necessary step in the development of CMC-based schemes for use in state-of-the-art GCMs. Therefore, in the present paper, we show results of the implementation of the stochastic multicloud model of Dorrestijn et al. (2015), referred to as Dor15, and a scheme similar to the CMC scheme of Gottwald et al. (2016), referred to as Gott15, in a GCM of intermediate complexity: the climate model Simplified Parameterizations, Primitive Equation Dynamics (SPEEDY) (Molteni 2003; Kucharski et al. 2006, 2013).

The stochastic schemes produce

Our paper is organized as follows. In section 2, we describe the Dor15 scheme, followed by a description of the Gott15 scheme in section 3. Then, we explain how we implement the schemes in SPEEDY in section 4. We specify the observational datasets in section 5, and we present model results in section 6. A discussion follows in section 7.

## 2. The Dor15 scheme

The stochastic multicloud model consists of a 2D square lattice with *N* nodes, with at each node a CMC, denoted

^{−1}),

*N*CMCs yield area fractions

Ideally, one would like to choose *N* such that the size of the microgrid cells corresponds to the typical size of a convective updraft area *N* should be the ratio between the GCM horizontal grid size area *N* is a scaling parameter enabling the Dor15 scheme to adapt to the GCM grid resolution and determines the magnitude of the stochastic fluctuations of the area fractions *N*, the smaller the deviations from the expectation values, to which the fractions converge if ^{2}, and therefore,

More information about multicloud models can be found in, for example, Ajayamohan et al. (2014), Deng et al. (2015), Dorrestijn et al. (2013a, 2015), Frenkel et al. (2013), Khouider and Majda (2006), Khouider et al. (2010), Majda et al. (2007), and Peters et al. (2013).
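The role of *N* as a scaling parameter can be illustrated with a small simulation: *N* independent chains are stepped with the same transition matrix, and the fraction of chains in a given state is recorded at each step. The sketch below is a simplification under stated assumptions. It uses a hypothetical two-state matrix (0 = no deep convection, 1 = deep convection) with made-up probabilities and a single fixed matrix, whereas in the scheme the matrix would be selected by the large-scale state.

```python
import random


def step_chain(state, matrix, rng):
    """Draw the next state of one chain from the row of its current state."""
    u = rng.random()
    cumulative = 0.0
    for next_state, p in enumerate(matrix[state]):
        cumulative += p
        if u < cumulative:
            return next_state
    return len(matrix) - 1  # guard against round-off in the row sum


def simulate_fraction(n_chains, matrix, target, n_steps, seed=0):
    """Time series of the fraction of chains that are in state `target`."""
    rng = random.Random(seed)
    states = [0] * n_chains
    series = []
    for _ in range(n_steps):
        states = [step_chain(s, matrix, rng) for s in states]
        series.append(states.count(target) / n_chains)
    return series


def std(values):
    """Population standard deviation."""
    mean = sum(values) / len(values)
    return (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5


# Hypothetical transition probabilities, chosen only for illustration.
MATRIX = [[0.9, 0.1],
          [0.3, 0.7]]

frac_100 = simulate_fraction(100, MATRIX, target=1, n_steps=2000)
frac_500 = simulate_fraction(500, MATRIX, target=1, n_steps=2000)
```

With these settings the fraction for 500 chains fluctuates less around its expectation than the fraction for 100 chains, and each series only takes values that are integer multiples of 1/*N*.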

## 3. The Gott15 scheme

In the Gott15 scheme, the CMCs switch between *N* CMCs in each GCM column). Thus, the scheme is less complex than the multicloud model but it is not scale aware. The fluctuations of

Again, we use the discretized Darwin radar dataset. The deep convective area fractions *k* means (MacQueen 1967; Gan et al. 2007) using

The Gott15 scheme is implemented in SPEEDY in the same way as the multicloud model:

## 4. Implementation in SPEEDY

SPEEDY is a GCM of intermediate complexity: only the most important processes are incorporated in the model, they are represented in a simplified way, and the GCM’s resolution is coarse (Kucharski et al. 2013). It is a hydrostatic spectral model that solves the primitive equations on the entire globe. The prognostic variables are vorticity, horizontal divergence, absolute temperature, surface pressure, and specific humidity. The time integration is performed by a leapfrog scheme and the time step in the standard version of SPEEDY is 40 min. In our version, the horizontal resolution is T30, referring to a triangular truncation at total wavenumber 30. The prognostic model fields are expanded into series of spherical harmonic functions of total wavenumber 30 and smaller. Along latitude circles these functions correspond to cosine and sine functions with maximum zonal wavenumber 30. This corresponds to a size of ~3.75° × 3.75° for each of the 96 × 48 = 4608 vertical columns. In the vertical, the model has eight pressure levels. SSTs are prescribed by using observational climatological fields, while land skin temperatures are prognosed using a soil model. SPEEDY has a seasonal cycle but no daily cycle. Simplified parameterizations are used to represent shortwave and longwave radiation, deep convection, clouds, surface heat and moisture fluxes, large-scale condensation, and vertical diffusion (representing, e.g., shallow convection). Precipitation is the sum of the large-scale and convective precipitation. The large-scale precipitation is derived from a large-scale condensation scheme and the convective precipitation is derived from the deep convection scheme.

We choose such a simplified GCM because it provides a convenient test bed for exploring new stochastic concepts in convection parameterizations and their impact on the representation of intraseasonal variability caused by equatorial waves. In that respect, this explorative study should be considered a natural intermediate step from recent offline studies (Dorrestijn et al. 2015; Gottwald et al. 2016) toward implementation in state-of-the-art GCMs.

### a. The relaxation closure (CTRL)

The deep convection scheme is a simplified Tiedtke mass flux scheme (Tiedtke 1989). Convection in a grid column is triggered if the atmosphere is conditionally unstable with respect to the lowest model level and if the relative humidity in the two lowest model levels exceeds a critical value (Molteni 2003). In the standard version of SPEEDY, the cloud-base mass flux

### b. Implementation of the stochastic schemes

*ρw*_{c} = 1 kg m^{−2} s^{−1} (Möbis and Stevens 2012). For the multicloud model, we will also test the influence of this particular choice by varying this updraft momentum. In one experiment, we set ^{−2} s^{−1} while using

When the multicloud model is used, we evolve

In each vertical column, the input of the CMCs is the large-scale vertical velocity

When the Gott15 scheme is used, we evolve only 1 CMC in every column of SPEEDY, which directly yields

## 5. Observations

We will compare the model behavior of SPEEDY with observations. We will use two observational datasets. The first dataset is the Darwin radar dataset. We will compare ^{−2} s^{−1}.

The second observational dataset is the daily accumulated precipitation GPI dataset (1° × 1°) (Huffman and Bolvin 2013). Since SPEEDY has a resolution of ~3.75° × 3.75°, we average the observational precipitation values over blocks of this size.
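Because 3.75° is not an integer multiple of 1°, averaging onto the coarser grid requires weighting each 1° cell by its overlap with the target block. The text does not state which method was used. The sketch below shows one possible overlap-weighted average along a single latitude circle, and the function name `block_average` is a hypothetical choice.

```python
def block_average(values, n_coarse):
    """Overlap-weighted average of equal-width cells onto n_coarse cells.

    `values` holds one number per fine cell along a latitude circle,
    for example 360 cells of 1 degree mapped onto 96 cells of 3.75 degrees.
    """
    n_fine = len(values)
    ratio = n_fine / n_coarse  # fine cells per coarse cell
    coarse = []
    for j in range(n_coarse):
        left, right = j * ratio, (j + 1) * ratio
        total = 0.0
        i = int(left)
        while i < right and i < n_fine:
            # Length of fine cell [i, i + 1) that lies inside [left, right).
            overlap = min(right, i + 1) - max(left, i)
            total += overlap * values[i]
            i += 1
        coarse.append(total / ratio)
    return coarse


# Toy example: a linear ramp on a 1-degree grid, averaged to 96 cells.
fine_precip = [float(i) for i in range(360)]
coarse_precip = block_average(fine_precip, 96)
```

This weighting preserves the zonal mean. A two-dimensional version would apply the same idea in latitude, where the cell areas also vary with latitude.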

## 6. Results

We run SPEEDY several times for 11 yr with different closures for

### a. at time step level

To get a first impression of the convective behavior of SPEEDY with the several closures, we show

In Fig. 2b, we see that the mass flux of CTRL is nonzero for specific time intervals, only when the trigger function is active (an inactive trigger function is indicated by a red dot at the horizontal axis). If the trigger function allows for convection, the mass flux is always close to 0.03 kg m^{−2} s^{−1}; CTRL has small variability. Further, there are periods when the trigger function switches convection on and off too rapidly—for example, from day 2 until day 5. The too-intermittent behavior of CTRL is due to the trigger function.

In Fig. 2c, we clearly see the discrete character of Dor15–100: only values that are integer multiples of ^{−2} s^{−1}, in this period of this particular realization, which suggests that the variability has improved compared to CTRL. Note that a zero mass flux can be the result of an inactive trigger function or a convective area fraction

In Fig. 2d, we see the mass flux produced by Dor15–500. Mass-flux values higher than 0.04 kg m^{−2} s^{−1} are rare. Deviations from the expectation values are expected to be smaller compared to the *N* even more, the time series start to resemble the series of CTRL. However, note that for the deterministic limit

In Fig. 2e, we see that Dor15–100w0.5 produces lower

Finally, the Gott15 scheme (Fig. 2f) produces ^{−2} s^{−1}. The general shape of the convective peaks looks quite realistic for this scheme. It is less intermittent than the multicloud and CTRL time series.

Clearly, compared to CTRL, the two stochastic schemes (Dor15 and Gott15) better reproduce the fluctuations observed in Darwin.

### b. The distribution of

In Fig. 3, the distributions of *y* axes are scaled logarithmically to make the tails of the distributions better visible. The Darwin histogram corresponds to a distribution that is approximately exponential with a maximum ^{−2} s^{−1}.

In Fig. 3b, we see that the mass flux of CTRL has a peak value at 0.03 kg m^{−2} s^{−1} and that the relative frequencies are rapidly decreasing to zero for larger mass fluxes. The maximum value lies below 0.05 kg m^{−2} s^{−1}. The mean mass flux of CTRL is larger than the mean mass flux observed in Darwin and the standard deviation is smaller. This is also the case if we evaluate the model data near Darwin instead of the entire tropical belt.

The mass flux of Dor15–100 (Fig. 3c) can attain values up to 0.10 kg m^{−2} s^{−1}. The discrete character of the scheme is visible, with only integer multiples of 0.01 kg m^{−2} s^{−1}. The mean flux is close to the mean flux of CTRL and its standard deviation is slightly larger. Dor15–500 (Fig. 3d) displays a histogram that resembles the histogram of CTRL, except that a higher maximum mass flux is possible. The histogram looks smoother than the histogram of Dor15–100, since integer multiples of 0.002 kg m^{−2} s^{−1} can be attained. The mean mass flux is lower than the mean mass flux of Dor15–100 and it has a smaller standard deviation. Dor15–100w0.5 produces lower
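The quoted increments of 0.01 and 0.002 kg m^{−2} s^{−1} are consistent with a cloud-base mass flux equal to the deep convective area fraction multiplied by the updraft momentum *ρw*_{c}, where the area fraction is the number of deep convective sites divided by *N*. The sketch below assumes this relation. It is inferred from the numbers in the text rather than quoted from it, and the function name is hypothetical.

```python
def cloud_base_mass_flux(n_convective, n_chains, rho_w_c=1.0):
    """Assumed closure: area fraction times updraft momentum (kg m^-2 s^-1)."""
    sigma_c = n_convective / n_chains  # deep convective area fraction
    return sigma_c * rho_w_c


# Smallest nonzero mass flux (one convective site) for each configuration.
step_100 = cloud_base_mass_flux(1, 100)            # N = 100, rho*w_c = 1
step_500 = cloud_base_mass_flux(1, 500)            # N = 500, rho*w_c = 1
step_100_w05 = cloud_base_mass_flux(1, 100, 0.5)   # N = 100, rho*w_c = 0.5
```

Under this assumption, halving the updraft momentum halves the increment for a given *N*, and the maximum of 0.10 kg m^{−2} s^{−1} quoted for Dor15–100 corresponds to 10 of the 100 sites being deep convective.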

Gott15 (Fig. 3f) attains 10 different mass flux values, which are exactly the values of the 10 cluster centroids. Its maximum mass flux is around 0.07 kg m^{−2} s^{−1}; higher maximum values can be obtained, for example, by using a larger number of cluster centroids. This last option would, however, need reconstruction of the Gott15 scheme through a revised estimation of the transition matrices. The relative frequency of the bins of Gott15 seems to decrease approximately exponentially. The mean and standard deviation are close to the observational values.
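The restriction to a fixed set of centroid values follows from discretizing a continuous quantity by clustering. As a rough illustration, the sketch below applies Lloyd's algorithm, a standard *k*-means variant, to scalar data and maps a value to its nearest centroid. The quantile-based initialization and the toy data are arbitrary assumptions, not the settings used for Gott15.

```python
def kmeans_1d(values, k, n_iter=50):
    """Lloyd's algorithm for scalar data; returns the sorted centroids."""
    ordered = sorted(values)
    # Initialize centroids at evenly spaced quantiles (an arbitrary choice).
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)]
                 for i in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def nearest_centroid(value, centroids):
    """Index of the centroid closest to `value` (the discrete state)."""
    return min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))


# Toy area fractions with two obvious groups.
toy_fractions = [0.0, 0.01, 0.02, 0.5, 0.51, 0.52]
toy_centroids = kmeans_1d(toy_fractions, k=2)
```

Once the centroids are fixed, a chain that switches between the resulting discrete states can only output the centroid values. Changing the number of centroids changes the state space and therefore requires the transition matrices to be estimated again.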

We conclude that, compared to CTRL, the stochastic schemes (Dor15 and Gott15) produce mass flux distributions that are more similar to the Darwin distribution. However, the discrete character of the stochastic schemes is not very realistic.

### c. Autocorrelation functions

Deep convection is correlated in time and probabilities of the occurrence and strength of convection depend strongly on earlier time instances. This is one of the reasons why we choose to parameterize convection with Markov chain models: to be able to capture this correlation. How well the several closures reproduce observational correlations can be assessed by calculating autocorrelation functions (ACFs) (Dorrestijn et al. 2015).
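For reference, a sample ACF of a time series can be computed as in the following minimal sketch. It uses the common biased estimator, normalized by the lag-zero sum of squared anomalies. Whether this matches the estimator used for the figures is an assumption.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0 .. max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    anomalies = [x - mean for x in series]
    variance = sum(a * a for a in anomalies)
    if variance == 0.0:
        raise ValueError("constant series has no defined autocorrelation")
    return [
        sum(anomalies[t] * anomalies[t + lag] for t in range(n - lag)) / variance
        for lag in range(max_lag + 1)
    ]


# Toy series that flips sign every step: strongly anticorrelated at lag 1.
toy_series = [1.0, -1.0] * 50
toy_acf = autocorrelation(toy_series, max_lag=2)
```

A series that persists over several time steps gives an ACF that decays slowly with lag, whereas a series that switches on and off rapidly gives one that drops quickly or turns negative.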

In Fig. 4a, we plot ACFs of

### d. Precipitation

The daily accumulated precipitation is an important output of GCMs. We will assess the different mass flux closures by comparing the model’s precipitation output with observations. In Fig. 4b, we show the PDFs of the nonzero daily accumulated precipitation for 10 yr of data between 15°N and 15°S. Note the logarithmic scale of the *y* axis. We see that the PDF produced while using Gott15 is very close to the PDF of the GPI observations (1° × 1°) for precipitation values less than 50 mm day^{−1} and that higher values are too frequent. Its PDF is less close to the GPI observations averaged over blocks of size 3.75° × 3.75°; only for precipitation values below 20 mm day^{−1} is there a good fit. Gott15 has been trained with data corresponding to an area of ~1.5° × 1.5°, which may explain why it is closer to GPI 1° × 1° than to GPI 3.75° × 3.75°.

The PDFs of Dor15–100, Dor15–500, and CTRL are similar, but not close to the observational PDFs. Above 45 mm day^{−1}, the PDFs decrease with the correct slope compared to GPI (3.75° × 3.75°). The PDF of Dor15–100w0.5 differs from the PDF of Dor15–100, but it is still not close to the observational PDFs. For values higher than 50 mm day^{−1}, the PDF is close to the PDF of Gott15. In section 6f, we will further examine the impact of

In Fig. 5, we show 10-yr-averaged equatorial precipitation. The general patterns produced by SPEEDY (Figs. 5b–f) are somewhat similar to the GPI observations (Fig. 5a): a narrow ITCZ in the northeastern Pacific Ocean and a wide one over the Maritime Continent. However, there are some major errors: for example, the precipitation in CTRL, Dor15–100, and Dor15–500 in the northeastern Pacific Ocean is twice as high as in GPI. Also, SPEEDY’s spatial patterns in the Indian Ocean differ significantly from the patterns in GPI.

Dor15–100 (Fig. 5c) and Dor15–500 (Fig. 5d) hardly change the precipitation patterns of CTRL (Fig. 5b). So, the schemes, based on different closures, produce similar 10-yr-average precipitation. This can be explained by realizing that precipitation scales with mass flux at cloud base. Inspection of Figs. 3b–d shows that the different closures give similar mean mass flux values of 0.02 kg m^{−2} s^{−1}. Dor15–100w0.5 and Gott15 produce significantly lower mean mass flux values, which explains the reduction of the intensity of the precipitation patterns (Figs. 5e–f). These schemes do not improve the patterns in general. Only the ITCZ in the northeastern Pacific Ocean seems to improve. Precipitation in the warm pool (140°E) is still too intense and too localized compared to the observations. We conclude that the intensity of

### e. Equatorial waves

At the equator, the Coriolis force vanishes, and it increases north and south of the equator. This results in dynamics that are typical for the tropics. The governing equations of atmosphere and ocean admit solutions that describe waves traveling along the equator. It is possible to discern atmospheric waves in satellite observations of precipitation because of their tendency to couple to deep convection.

A distinction can be made between waves that are mainly symmetrical with respect to the equator—for example, equatorial Kelvin waves traveling eastward with 15 m s^{−1} (or ^{−1}), equatorial Rossby waves (ER) traveling westward, westward inertio-gravity (WIG) waves, eastward inertio-gravity waves, and the MJO traveling eastward with 5 m s^{−1} (or ^{−1}) and waves with an antisymmetric structure with respect to the equator (e.g., mixed Rossby–gravity). For a comprehensive treatise on equatorial waves, we refer to Wheeler and Kiladis (1999). State-of-the-art GCMs should be able to reproduce these waves. Producing realistic equatorial waves (especially the MJO) is one of the major challenges for weather and climate modelers (Kiladis et al. 2009; Biello and Majda 2005).

As in Zhang (2005), we show in Fig. 6 longitude–time plots, also known as Hovmöller diagrams, of the equatorial daily precipitation averaged over 15°N–15°S for 1 yr of GPI observations and for the SPEEDY experiments. Hovmöller plots are useful for gaining a first insight into the model’s ability to produce equatorial waves.

The eastward-moving Kelvin waves are clearly visible for CTRL, Dor15–100, and Dor15–500 (Figs. 6b–d). In the observations (Fig. 6a), these Kelvin waves are visible but not as prominent. The Hovmöller diagrams of the multicloud model and CTRL are in general very similar. The multicloud model seems to produce slightly larger coherent structures of heavy rainfall, which are visible as tiny red blobs—for example, in January at 90° and 150°E in Fig. 6d. The MJO events in the GPI observations—for example, in February (60°E–180°)—are prominent and are missing in CTRL, Dor15–100, and Dor15–500. In the Hovmöller diagram of the Gott15 scheme (Fig. 6f), large convective events are present (e.g., the red blobs between 120°E and 180°), considerably more than in CTRL. We even see that MJO-like waves are present between 60°E and 180° in January. These MJO-like waves are, however, not as strong as in GPI, which indicates that the representation of spatial organization of convection is still inadequate.

The Hovmöller diagram of Dor15–100w0.5, with *ρw*_{c} = 0.5 kg m^{−2} s^{−1}, differs from the Hovmöller diagram of CTRL: the Kelvin waves are less prominent and structures of heavy rainfall can be seen (mainly between 60°E and 180°) that are similar to the structures of Gott15. Also the MJO-like waves are present (60°E–180°, July–August) but are even weaker than for Gott15.

To further examine the model’s ability to produce equatorial waves and investigate intraseasonal variability, we calculate Wheeler–Kiladis diagrams (Wheeler and Kiladis 1999) of the equatorial precipitation. We focus on the symmetric part of the precipitation, since we are mostly interested in equatorial Kelvin waves and the MJO—the waves with the largest contributions to intraseasonal variance in precipitation. We calculate zonal–wavenumber–frequency diagrams of the symmetric part of the equatorial precipitation (15°N–15°S) divided by the background spectrum, for which we apply smoothing with a 1–2–1 filter.
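Two of the processing steps mentioned above, extracting the part of a field that is symmetric about the equator and smoothing with a 1–2–1 filter, can be sketched as follows. The sketch assumes latitude rows ordered from south to north on a grid that is symmetric about the equator. Holding the end points fixed in the filter and the number of passes are assumptions, because the text does not specify them.

```python
def symmetric_part(field_by_lat):
    """Symmetric component about the equator.

    field_by_lat[j] is a list of values along longitude at latitude row j,
    with row j and row (n_lat - 1 - j) mirrored about the equator.
    """
    n_lat = len(field_by_lat)
    return [
        [0.5 * (a + b)
         for a, b in zip(field_by_lat[j], field_by_lat[n_lat - 1 - j])]
        for j in range(n_lat)
    ]


def smooth_121(values, passes=1):
    """Apply a 1-2-1 filter `passes` times, holding the end points fixed."""
    out = list(values)
    if len(out) < 3:
        return out
    for _ in range(passes):
        prev = out
        interior = [0.25 * prev[i - 1] + 0.5 * prev[i] + 0.25 * prev[i + 1]
                    for i in range(1, len(prev) - 1)]
        out = [prev[0]] + interior + [prev[-1]]
    return out


# Toy checks: a purely antisymmetric field has no symmetric part,
# and a single spike is spread over its neighbours by repeated passes.
toy_symmetric = symmetric_part([[1.0, 2.0], [-1.0, -2.0]])
toy_smoothed = smooth_121([0.0, 4.0, 0.0])
```

In a full wavenumber–frequency analysis, the symmetric field would be Fourier transformed in longitude and time, and a filter of this kind would be applied repeatedly to the raw spectrum to obtain the background by which the spectrum is divided.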

In Fig. 7, we plot the diagrams for the GPI observations (Lin et al. 2006) and the SPEEDY experiments. Note, first of all, that all the SPEEDY model results differ significantly from the GPI diagram. This is, besides the differences in the power of the waves, caused by the different background spectra by which the spectra are divided. In the observations (Fig. 7a), we clearly see the MJO peak (around zonal wavenumbers 1–5 with a period between 32 and 96 days) and the Kelvin waves for positive wavenumbers. Further, we see the ER and the WIG less prominently. The diagrams of CTRL, Dor15–100, and Dor15–500 (Figs. 7b–d) are very similar to each other and show too-prominent Kelvin waves while the MJO is essentially missing in these diagrams. These are typical model misrepresentations that occur in many state-of-the-art GCMs (Lin et al. 2006). Our multicloud scheme is not able to improve the MJO. Successful MJO-like simulation with similar stochastic multicloud models is possible as demonstrated by Deng et al. (2015).

In the diagram of Gott15 (Fig. 7f), we see an MJO peak, and the Kelvin waves are less prominent than in CTRL. The tropical depressions (TD type) are too prominent. For Dor15–100w0.5 (Fig. 7e), the Kelvin waves slightly diminish in comparison to Dor15–100, the TD type are even more prominent than in Gott15, and the MJO peak is missing.

### f. The updraft momentum at cloud base

In the implementation of the stochastic schemes in SPEEDY, *ρw*_{c} = 1 kg m^{−2} s^{−1} in (3). We find that changing

Figure 8a displays the result of 12 independent 11-yr runs of SPEEDY using the Dor15–100 scheme with different values of *ρw*_{c} ≈ 0.45 kg m^{−2} s^{−1}. The MJO power tends to increase for smaller updraft momentum values but never reaches the MJO observational power of 0.14. Note that for

With this novel method of calculating the power of the MJO and equatorial Kelvin waves, it is possible to express the model’s ability to simulate these waves in a single scalar. This enables modelers to directly tune parameters for optimal simulation of these waves. Note, however, that even if the powers are exactly equal to the observational powers, it is not yet sufficient to conclude that the model simulates the waves perfectly. Other requirements have to be fulfilled as well (Zhang 2005). The power only gives an impression. For example, CTRL gives a too-high equatorial Kelvin wave power, 0.12, and a too-low MJO power, 0.02, which is consistent with the patterns found in the Hovmöller diagrams (Figs. 6a,b).

## 7. Discussion

We have implemented two different stochastic parameterizations for the convective area fraction

In both stochastic parameterizations

At the local gridpoint level, both stochastic schemes produce mass flux time series that are more realistic than the series produced by the standard CTRL version (Fig. 2). This is also reflected in a broader and more realistic frequency-of-occurrence distribution of the cloud-base mass flux (Fig. 3). Gott15 and, to a lesser extent, Dor15 also improve the daily accumulated tropical precipitation compared to CTRL (Fig. 4b). Substantial improvement of the temporal autocorrelation function for

Wheeler–Kiladis diagrams show that the equatorial Kelvin waves are too prominent in SPEEDY for CTRL and that the MJO is missing entirely. Gott15 significantly improves the representation of both the MJO and the equatorial Kelvin waves. Dor15 is only able to improve on this issue by strongly reducing

How much of the model error is due to the convection schemes and how much is due to the large-scale forcings of SPEEDY? The range of

An advantage of the Dor15 scheme over the Gott15 scheme is that it can be adapted to the scale of the GCM grid column, which makes it more universal in usage. We have, however, seen that the results for the Gott15 scheme are better than for the more involved Dor15 scheme. The main difference between the two methods is that the Gott15 scheme has been trained with the macroscopic data (i.e., averaged *N* CMCs for which each CMC corresponds to the size of a convective updraft and

Including local interactions between neighboring cells in the Dor15 model could improve its performance but lies beyond the scope of this paper. Including spatial interactions makes the model more complicated, because a different transition probability matrix is needed for every configuration of the neighboring cells. For a successful inclusion of spatial interaction, we refer to Bengtsson et al. (2013), in which a cellular automata approach (deterministic and stochastic) is applied to make convection interact spatially between different grid boxes of an NWP model, leading to a more realistic representation of convective organization. Further, in Dorrestijn et al. (2013a), locally interacting CMCs have been inferred from LES data, and in Khouider (2014), the multicloud model of Khouider et al. (2010) is extended by including spatial dependencies.

The Dor15 multicloud model is inspired by the multicloud model of Khouider et al. (2010). The models are similar because in both models CMCs are positioned on a microgrid and randomly switch cloud type with probabilities that depend on the large-scale forcing. The main difference between the models is that the transition probabilities of the Dor15 scheme are estimated from data while the transition probabilities used in Khouider et al. (2010) are derived by choosing typical time scales of formation of clouds, conversion between cloud types, and decay of clouds, which are based on physical intuition. Furthermore, in the multicloud model of Khouider et al. (2010), probabilities are conditioned on CAPE and middle-troposphere dryness instead of large-scale vertical velocity for Dor15. In Khouider et al. (2010), a stochastic coarse-grained birth–death system is derived for the multicloud model, such that each GCM column only uses one CMC, which makes the method very effective. Further, the model of Khouider et al. (2010) is scale aware because the number of lattice sites in the microgrid can be adapted to the GCM gridbox size. We conclude that the beneficial properties of both methods could be combined to obtain an even better model. Especially the inclusion of spatial dependencies as in the extension in Khouider (2014) is promising.

In some recent studies (Ajayamohan et al. 2014; Ragone et al. 2015), new convection parameterizations have been implemented in aquaplanet GCMs. SPEEDY can also run in aquaplanet mode, but for comparison to satellite observations, we have chosen to include land in the experiments.

A final remark concerns the computational costs of the new stochastic schemes. The multicloud scheme, for which *N* CMCs have to be evolved for each grid column (including the generation of random numbers), increases the computational costs of the convective scheme substantially, while the computational burden of Gott15 is marginal. In Table 1, we list these computational costs. In GCMs with a large number of grid columns, using a large number of CMCs

Table 1. Computational costs (seconds per model day) of SPEEDY with the several closures compared to CTRL. The third column shows the number of random numbers that have to be generated each model day. The last column shows the number of random numbers that have to be generated for each model column (4608 columns) each time step (10 min). Calculations are performed on a PC with 7.7-GB memory and a 2.5-GHz processor.

## Acknowledgments

The GPI dataset was provided by the NASA Goddard Space Flight Center’s Mesoscale Atmospheric Processes Laboratory. This research was supported by the Division for Earth and Life Sciences (ALW) with financial aid from the Netherlands Organization for Scientific Research (NWO). We are grateful to three anonymous reviewers for helpful comments.

## REFERENCES

Ajayamohan, R. S., , B. Khouider, , and A. Majda, 2014: Simulation of monsoon intraseasonal oscillations in a coarse-resolution aquaplanet GCM.

,*Geophys. Res. Lett.***41**, 5662–5669, doi:10.1002/2014GL060662.Arakawa, A., 2004: The cumulus parameterization problem: Past, present, and future.

,*J. Climate***17**, 2493–2525, doi:10.1175/1520-0442(2004)017<2493:RATCPP>2.0.CO;2.Arakawa, A., , and W. Schubert, 1974: Interaction of a cumulus cloud ensemble with the large-scale environment, Part I.

,*J. Atmos. Sci.***31**, 674–701, doi:10.1175/1520-0469(1974)031<0674:IOACCE>2.0.CO;2.Arakawa, A., , J.-H. Jung, , and C.-M. Wu, 2011: Toward unification of the multiscale modeling of the atmosphere.

,*Atmos. Chem. Phys.***11**, 3731–3742, doi:10.5194/acp-11-3731-2011.Bengtsson, L., , M. Steinheimer, , P. Bechtold, , and J.-F. Geleyn, 2013: A stochastic parametrization for deep convection using cellular automata.

,*Quart. J. Roy. Meteor. Soc.***139**, 1533–1543, doi:10.1002/qj.2108.Biello, J., , and A. Majda, 2005: A new multiscale model for the Madden–Julian oscillation.

,*J. Atmos. Sci.***62**, 1694–1721, doi:10.1175/JAS3455.1.Buizza, R., , M. Milleer, , and T. Palmer, 1999: Stochastic representation of model uncertainties in the ECMWF Ensemble Prediction System.

,*Quart. J. Roy. Meteor. Soc.***125**, 2887–2908, doi:10.1002/qj.49712556006.Crommelin, D., , and E. Vanden-Eijnden, 2008: Subgrid-scale parameterization with conditional Markov chains.

*J. Atmos. Sci.*, **65**, 2661–2675, doi:10.1175/2008JAS2566.1.

Davies, L., C. Jakob, P. May, V. V. Kumar, and S. Xie, 2013: Relationships between the large-scale atmosphere and the small-scale convective state for Darwin, Australia. *J. Geophys. Res. Atmos.*, **118**, 11 534–11 545, doi:10.1002/jgrd.50645.

Deng, Q., B. Khouider, and A. Majda, 2015: The MJO in a coarse-resolution GCM with a stochastic multicloud parameterization. *J. Atmos. Sci.*, **72**, 55–74, doi:10.1175/JAS-D-14-0120.1.

Deng, Q., B. Khouider, and A. Majda, 2016: Effect of stratiform heating on the planetary-scale organization of tropical convection. *J. Atmos. Sci.*, **73**, 371–392, doi:10.1175/JAS-D-15-0178.1.

Dorrestijn, J., D. Crommelin, J. Biello, and S. Böing, 2013a: A data-driven multi-cloud model for stochastic parametrization of deep convection. *Philos. Trans. Roy. Soc. London*, **A371**, 20120374, doi:10.1098/rsta.2012.0374.

Dorrestijn, J., D. Crommelin, A. P. Siebesma, and H. Jonker, 2013b: Stochastic parameterization of shallow cumulus convection estimated from high-resolution model data. *Theor. Comput. Fluid Dyn.*, **27**, 133–148, doi:10.1007/s00162-012-0281-y.

Dorrestijn, J., D. Crommelin, A. P. Siebesma, H. Jonker, and C. Jakob, 2015: Stochastic parameterization of convective area fractions with a multicloud model inferred from observational data. *J. Atmos. Sci.*, **72**, 854–869, doi:10.1175/JAS-D-14-0110.1.

Frenkel, Y., A. Majda, and B. Khouider, 2013: Stochastic and deterministic multicloud parameterizations for tropical convection. *Climate Dyn.*, **41**, 1527–1551, doi:10.1007/s00382-013-1678-z.

Frierson, D. M. W., 2007: Convectively coupled Kelvin waves in an idealized moist general circulation model. *J. Atmos. Sci.*, **64**, 2076–2090, doi:10.1175/JAS3945.1.

Gan, G., C. Ma, and J. Wu, 2007: *Data Clustering: Theory, Algorithms, and Applications.* 3rd ed. ASA-SIAM Series on Statistics and Applied Probability, SIAM, 466 pp., doi:10.1137/1.9780898718348.

Gottwald, G. A., K. Peters, and L. Davies, 2016: A data-driven method for the stochastic parametrisation of subgrid-scale tropical convective area fraction. *Quart. J. Roy. Meteor. Soc.*, doi:10.1002/qj.2655, in press.

Huffman, G. J., and D. T. Bolvin, 2013: Version 1.2 GPCP one-degree daily precipitation data set documentation. WDC-A, NCDC, GPCP 1DD, 27 pp. [Available online at ftp://meso.gsfc.nasa.gov/pub/1dd-v1.2/1DD_v1.2_doc.pdf.]

Khouider, B., 2014: A coarse grained stochastic multi-type particle interacting model for tropical convection: Nearest neighbour interactions. *Commun. Math. Sci.*, **12**, 1379–1407, doi:10.4310/CMS.2014.v12.n8.a1.

Khouider, B., and A. Majda, 2006: A simple multicloud parameterization for convectively coupled tropical waves. Part I: Linear analysis. *J. Atmos. Sci.*, **63**, 1308–1323, doi:10.1175/JAS3677.1.

Khouider, B., J. Biello, and A. Majda, 2010: A stochastic multicloud model for tropical convection. *Commun. Math. Sci.*, **8**, 187–216, doi:10.4310/CMS.2010.v8.n1.a10.

Kiladis, G., M. Wheeler, P. T. Haertel, K. H. Straub, and P. E. Roundy, 2009: Convectively coupled equatorial waves. *Rev. Geophys.*, **47**, RG2003, doi:10.1029/2008RG000266.

Kucharski, F., F. Molteni, and A. Bracco, 2006: Decadal interactions between the western tropical Pacific and the North Atlantic Oscillation. *Climate Dyn.*, **26**, 79–91, doi:10.1007/s00382-005-0085-5.

Kucharski, F., F. Molteni, M. King, R. Farneti, I.-S. Kang, and L. Feudale, 2013: On the need of intermediate complexity general circulation models: A “SPEEDY” example. *Bull. Amer. Meteor. Soc.*, **94**, 25–30, doi:10.1175/BAMS-D-11-00238.1.

Kumar, V. V., C. Jakob, A. Protat, P. T. May, and L. Davies, 2013: The four cumulus cloud modes and their progression during rainfall events: A C-band polarimetric radar perspective. *J. Geophys. Res. Atmos.*, **118**, 8375–8389, doi:10.1002/jgrd.50640.

Kuo, H. L., 1965: On formation and intensification of tropical cyclones through latent heat release by cumulus convection. *J. Atmos. Sci.*, **22**, 40–63, doi:10.1175/1520-0469(1965)022<0040:OFAIOT>2.0.CO;2.

Lin, J.-B., and J. Neelin, 2000: Influence of a stochastic moist convective parameterization on tropical climate variability. *Geophys. Res. Lett.*, **27**, 3691–3694, doi:10.1029/2000GL011964.

Lin, J.-L., and Coauthors, 2006: Tropical intraseasonal variability in 14 IPCC AR4 climate models. Part I: Convective signals. *J. Climate*, **19**, 2665–2690, doi:10.1175/JCLI3735.1.

Lorenz, E. N., 1996: Predictability—A problem partly solved. *Proc. Seminar on Predictability*, Reading, United Kingdom, ECMWF, 1–18.

MacQueen, J., 1967: Some methods for classification and analysis of multivariate observations. *Proc. Fifth Berkeley Symp. Mathematical Statistics and Probability*, Berkeley, CA, Statistical Laboratory of the University of California, 281–297.

Majda, A., S. Stechmann, and B. Khouider, 2007: Madden-Julian Oscillation analog and intraseasonal variability in a multicloud model above the equator. *Proc. Natl. Acad. Sci. USA*, **104**, 9919–9924, doi:10.1073/pnas.0703572104.

Möbis, B., and B. Stevens, 2012: Factors controlling the position of the Intertropical Convergence Zone on an aquaplanet. *J. Adv. Model. Earth Syst.*, **4**, M00A04, doi:10.1029/2012MS000199.

Molteni, F., 2003: Atmospheric simulations using a GCM with simplified physical parametrizations. I. Model climatology and variability in multi-decadal experiments. *Climate Dyn.*, **20**, 175–191, doi:10.1007/s00382-002-0268-2.

Palmer, T. N., 2001: A nonlinear dynamical perspective on model error: A proposal for non-local stochastic-dynamic parametrization in weather and climate prediction models. *Quart. J. Roy. Meteor. Soc.*, **127**, 279–304, doi:10.1002/qj.49712757202.

Peters, K., C. Jakob, L. Davies, B. Khouider, and A. Majda, 2013: Stochastic behavior of tropical convection in observations and a multicloud model. *J. Atmos. Sci.*, **70**, 3556–3575, doi:10.1175/JAS-D-13-031.1.

Plant, R., and G. Craig, 2008: A stochastic parameterization for deep convection based on equilibrium statistics. *J. Atmos. Sci.*, **65**, 87–105, doi:10.1175/2007JAS2263.1.

Ragone, F., K. Fraedrich, H. Borth, and F. Lunkeit, 2015: Coupling a minimal stochastic lattice gas model of a cloud system to an atmospheric general circulation model. *Quart. J. Roy. Meteor. Soc.*, **141**, 37–51, doi:10.1002/qj.2331.

Randall, D. A., Harshvardhan, D. A. Dazlich, and T. G. Corsetti, 1989: Interactions among radiation, convection, and large-scale dynamics in a general circulation model. *J. Atmos. Sci.*, **46**, 1943–1970, doi:10.1175/1520-0469(1989)046<1943:IARCAL>2.0.CO;2.

Teixeira, J., and C. Reynolds, 2008: Stochastic nature of physical parameterizations in ensemble prediction: A stochastic convection approach. *Mon. Wea. Rev.*, **136**, 483–496, doi:10.1175/2007MWR1870.1.

Tiedtke, M., 1989: A comprehensive mass flux scheme for cumulus parameterization in large-scale models. *Mon. Wea. Rev.*, **117**, 1779–1800, doi:10.1175/1520-0493(1989)117<1779:ACMFSF>2.0.CO;2.

Wheeler, M., and G. Kiladis, 1999: Convectively coupled equatorial waves: Analysis of clouds and temperature in the wavenumber–frequency domain. *J. Atmos. Sci.*, **56**, 374–399, doi:10.1175/1520-0469(1999)056<0374:CCEWAO>2.0.CO;2.

Zhang, C., 2005: Madden-Julian Oscillation. *Rev. Geophys.*, **43**, RG2003, doi:10.1029/2004RG000158.