## 1. Introduction

Clouds play an important role in the earth’s climate. Classification of extensive satellite observations of clouds, as well as modeling of cloud dynamics and radiation processes in general circulation models, demands simple statistical parameterizations of broken cloud fields. Observations of cloud fields often are made along lines that transect the field. Such observations include, for example, lidar and ground-based sun-photometric measurements. The latter provide a binary time series indicating either presence or absence of cloud at a given moment in the line connecting the instrument and the sun [see the descriptions of cloud screening algorithms by Alexandrov et al. (2004) for the Multifilter Rotating Shadow-Band Radiometer (MFRSR) (cf. Harrison et al. 1994) and by Smirnov et al. (2000) for the CIMEL sun/sky radiometer operated by the Aerosol Robotic Network (AERONET) (cf. Holben et al. 1998)]. These time series have been used to provide statistical distributions of cloud cover and cloud overpass times, which can be converted into cloud chord length (CCL) distributions using wind speed estimates. If horizontal shapes of clouds can be assumed to be round (e.g., in the cumulus case), a CCL distribution can be converted into area distribution (Yau and Rogers 1984; Sauvageot et al. 1999). Similarly to the cloud chord length, distributions of lengths of clear-sky gaps between clouds can be also derived. Theoretical statistical studies of geophysical fields distributed along a line were performed by Sánchez et al. (1994) and Astin and Di Girolamo (1999). Cloud statistics from linear transects play an important role in stochastic radiative transfer models (Byrne 2005; Lane-Veron and Somerville 2004). Several studies have been published describing the cloud chord statistics based on both ground-based measurements (Lane et al. 2002; Berg and Kassianov 2008) and aircraft and satellite data (Plank 1969; Cahalan and Joseph 1989; Joseph and Cahalan 1990; Rodts et al. 2003). Some of these studies, especially those restricted to fair weather cumulus fields, report exponential cloud chord length distributions (Astin and Latter 1998; Lane et al. 2002), while others report power-law distributions (Cahalan and Joseph 1989; Koren et al. 2008). The observed cloud chord distribution depends on the dominant cloud type(s) during an observation period.

There are two main categories of cloud models widely used in simulating the spatial distribution of cloud condensate: dynamical and stochastic cloud models. Dynamical cloud models are physically based but require a lot of atmospheric parameters to be specified and are often computationally expensive. Stochastic cloud models are based on satellite, aircraft, or ground measurements and generate cloud fields close to the observed ones (Evans and Wiscombe 2004; Hogan and Kew 2005; Venema et al. 2006; Prigarin and Marshak 2009) and are computationally inexpensive. Some stochastic models generate cloud fields with an internal cloud structure (Schertzer and Lovejoy 1987; Cahalan 1994; Marshak et al. 1994; Schmidt et al. 2007), while others generate cloud fields as a binary mixture of cloudy and cloud-free areas (Su and Pomraning 1994; Zuev and Titov 1995; Prigarin et al. 2002). The latter have been used for stochastic radiative transfer modeling, which provides a radiation field averaged over many realizations of a stochastic cloud model (Sánchez et al. 1994; Titov 1990; Malvagi et al. 1993; Kassianov 2003; Lane-Veron and Somerville 2004).

Unlike a completely deterministic model of a cloud field (Aida 1977; Kite 1987; Breon 1992), which consists of identical clouds with centers on a regular grid, in a stochastic model the most straightforward assumption (approximation) of the distribution of cloudy and clear areas in a binary mixture is a homogeneous Markovian distribution. For this distribution, statistics of the medium are everywhere the same; thus, the probability of moving from cloudy areas to clear (or the reverse) along a line is independent of the position on the line (e.g., Byrne 2005). This assumption leads to an exponential distribution of cloud chord (and cloud separations) lengths, though not necessarily with the same characteristic lengths. For example, in the stochastic model of Su and Pomraning (1994) nonoverlapping clouds of fixed length randomly placed in an infinite 1D atmosphere exhibited exponential intercloud spacing distribution. Other models (Randall and Huffman 1980; Ellingson 1982; Ramirez and Bras 1990) allow for a variety of cloud sizes, dimensions, and altitudes.

We introduce a cloud-field model based on an alternative to the above described statistical models. Instead of randomly creating and then randomly placing a cloud into the model “atmosphere” (e.g., Su and Pomraning 1994), we consider the latter as a lattice with cells (of size *l*) that may either be occupied by a cloud with probability *p* or be clear with probability *q* = 1 − *p*. Lattice models like this are widely used in percolation theory (cf. Isichenko 1992). Such an approach was used for cloud modeling by Nagel and Raschke (1992), but in a different form than that presented here. The advantage of a lattice model is in parameterizing the cloud field as a whole, considering clouds and gaps between them together, and allowing the relation of such quantities as cloud cover (or fraction) and distributions of clouds and gaps by size, which appear at first glance to be independent. We further develop our model by allowing clouds to take arbitrary size and shape (instead of being clusters of several square cells) while retaining the basic statistical properties of the discrete-cell prototype (i.e., the probability of a cell of size *l* being occupied by cloud still equals *p*). Computations in the framework of the continuous model were performed in 1D case; however, their results are also applicable to the statistics of 2D cloud fields (which need to be reasonably statistically isotropic) sampled along 1D transects.

Physically, the cell model is justified by the observed structure of cumulus [e.g., Wielicki and Welch (1986)] and stratocumulus (Nicholls 1989) cloud fields. In particular, Wielicki and Welch (1986) report that the fair weather cumulus clouds (even as small as 1 km in diameter) observed in their Landsat-based study were often multicelled, with cell sizes ranging from 250 m to 1 km. The aircraft-based observations by Nicholls (1989) show the chord distribution of convective cells in stratocumulus cloud fields to be consistent with the hexagonal cell structure having average cell diameter ranging from 200 to 600 m (i.e., being approximately half of the boundary layer depth). A convective cell can be filled (completely or partially) with cloud or remain cloud-free depending on atmospheric conditions. While we do not pursue their physical nature in this study, the cells in our model can be considered as abstractions of the atmospheric convective cells. We will refer to them as cellular statistical model (CSM) cells.

In section 2 we describe the basic discrete cellular cloud model. The statistics of its continuous extension are presented in section 3. Section 4 describes the construction of numerical examples in 1D and 2D cases. The relationship of our construction to the Ising model is explained in section 5. We conclude with discussion and a summary in sections 6 and 7.

## 2. Discrete cloud field model with fixed occupancy probability

We start with presentation of the basic discrete-cell model, and then construct its continuous extension, which is more capable of describing realistic cloud fields.

### a. Infinite lattice

Consider an infinite 1D lattice, each cell of which can be occupied by cloud with probability $p$ ($0 < p < 1$) or remain empty with probability $q = 1 - p$. We define a cloud of the size $m$ as a sequence of $m$ neighboring cloudy cells separated from other clouds by at least one clear cell at each end of the cloud. The statistical distribution of clouds by size is geometric:

$$n_c(m) = q\,p^{m-1}, \tag{1}$$

where $m = 1, 2, 3, \ldots$. Indeed, the probability of finding a cloud of size $m$ is $q^2 p^m$ and that of finding any cloud at all is $qp$. Thus, the conditional probability of finding one, given that there is cloud there, is $qp^{m-1}$. Note that the smallest cloud size is 1, even if $p \to 0$. This distribution can also be derived from Bernoulli’s scheme of independent trials starting at a cloudy cell and repeated until the first failure (clear cell). It satisfies the normalization condition

$$\sum_{m=1}^{\infty} n_c(m) = 1$$

and has the mean cloud size

$$m_c = \sum_{m=1}^{\infty} m\,n_c(m) = \frac{1}{q}. \tag{3}$$
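
The geometric law can be checked numerically. The following Python sketch fills a long lattice with independent cloudy cells and compares the observed run-length frequencies with $q p^{m-1}$ and the observed mean with $1/q$. The function names, lattice length, and seed are illustrative choices, not part of the model.

```python
import random
from collections import Counter

def cloud_sizes(cells):
    """Return the lengths of all runs of cloudy (True) cells."""
    sizes, run = [], 0
    for cloudy in cells:
        if cloudy:
            run += 1
        elif run:
            sizes.append(run)
            run = 0
    if run:  # a run that reaches the end of the lattice
        sizes.append(run)
    return sizes

def simulate(p, n_cells=200_000, seed=0):
    """Fill n_cells independently with probability p; return (frequencies, mean size)."""
    rng = random.Random(seed)
    cells = [rng.random() < p for _ in range(n_cells)]
    sizes = cloud_sizes(cells)
    counts = Counter(sizes)
    total = len(sizes)
    freq = {m: counts[m] / total for m in sorted(counts)}
    return freq, sum(sizes) / total

if __name__ == "__main__":
    p = 0.4
    freq, mean = simulate(p)
    for m in range(1, 6):
        # observed frequency next to the geometric prediction q * p**(m-1)
        print(m, round(freq.get(m, 0.0), 4), round((1 - p) * p ** (m - 1), 4))
    print("mean size:", round(mean, 3), "expected 1/q:", round(1 / (1 - p), 3))
```

The lattice is long but finite, so the one run that may be cut by the lattice end introduces a negligible bias.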

### b. Finite-length samples

In the real-world observations, cloud field samples have finite size. As we will show below, the observed cloud length statistics are influenced by the sample size, especially for large clouds. Also, we need to consider overcast clouds (i.e., those with the length equal to or exceeding the sample size), which do not exist on the infinite lattice.

For a sample of $N$ cells, the cloud size distribution takes the form (derived in appendix A)

$$n_c(m) = \frac{2 + q(N - m - 1)}{1 + q(N - 1)}\,q\,p^{m-1} \tag{6}$$

for $m < N$, while

$$n_c(N) = \frac{p^{N-1}}{1 + q(N - 1)}$$

for $m = N$. By construction, $n_c$ is normalized by

$$\sum_{m=1}^{N} n_c(m) = 1,$$

and the mean cloud size is

$$m_c = \frac{N}{1 + q(N - 1)}. \tag{9}$$

In the limit $N \to \infty$ ($N \gg m$), the distribution (6) becomes geometric (1), while $n_c(N)$ vanishes and the mean cloud size takes the form (3). In the opposite limit case of $N = 1$, the mean cloud size (9) is expectedly equal to 1, and $n_c$ has only one value $n_c(N) = 1$. In the limiting case of $p = 0$ ($q = 1$, no chance of a cloud), $m_c = 1$, which is the minimal possible size in the discrete model. Indeed, in the discrete model as $p$ tends to 0, it becomes increasingly difficult to find a cloud, but if we succeed in finding one, it cannot be smaller than the cell size. In the overcast case $p = 1$ ($q = 0$, no chance of clearing), $m_c = N$, which is the full length of the sample.
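
The finite-sample statistics can be tabulated directly. The sketch below assumes that a sample of $N$ cells is cut at random from an infinite lattice and that clouds truncated by the sample ends are counted on an equal footing with whole ones. Counting runs under these assumptions gives the weights coded here; this closed form is a derivation under those assumptions and should be compared with appendix A. The function names are hypothetical.

```python
def finite_cloud_size_dist(p, N):
    """Cloud size distribution n_c(m), m = 1..N, for a sample of N cells.

    Assumes independent cells with occupation probability p and a sample
    cut at random from an infinite lattice (partial clouds are counted).
    """
    q = 1.0 - p
    norm = 1.0 + q * (N - 1)  # expected number of clouds in the sample, divided by p
    dist = {m: q * p ** (m - 1) * (2.0 + q * (N - m - 1)) / norm
            for m in range(1, N)}
    dist[N] = p ** (N - 1) / norm  # overcast sample
    return dist

def mean_cloud_size(p, N):
    """Mean cloud size: expected cloudy cells (N p) over expected number of clouds."""
    return N / (1.0 + (1.0 - p) * (N - 1))

if __name__ == "__main__":
    p, N = 0.3, 10
    dist = finite_cloud_size_dist(p, N)
    print("sum of n_c:", sum(dist.values()))
    print("mean from distribution:", sum(m * w for m, w in dist.items()))
    print("mean from closed form:", mean_cloud_size(p, N))
```

The limiting cases quoted in the text ($N = 1$, $p = 0$, $p = 1$, and $N \to \infty$) can be verified by calling these functions with the corresponding arguments.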

### c. Cloud cover

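The cloud cover of a sample is $c = k/N$, where $k$ is the number of all cloudy cells (not necessarily consecutive) in a sample of size $N$. As follows from combinatorics, the probability of having $k$ occupied cells in a sample obeys the binomial distribution

$$P(k) = \binom{N}{k}\,p^k q^{N-k},$$

so that the mean cloud cover is $\bar{c} = p$.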
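
As a small illustration, the binomial cloud cover distribution of the discrete model can be evaluated with the standard library. The function name `cloud_cover_pmf` is a hypothetical helper, and the cloud cover is taken to be $k/N$.

```python
from math import comb

def cloud_cover_pmf(p, N):
    """Map each possible cloud cover k/N to its binomial probability."""
    q = 1.0 - p
    return {k / N: comb(N, k) * p ** k * q ** (N - k) for k in range(N + 1)}

if __name__ == "__main__":
    pmf = cloud_cover_pmf(0.3, 12)
    print("total probability:", sum(pmf.values()))
    print("mean cloud cover:", sum(c * w for c, w in pmf.items()))
```

The mean of this distribution equals $p$, and the two end values (cover 0 and cover 1) have probabilities $q^N$ and $p^N$.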

## 3. Continuous limit

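To construct the continuous extension of the model, we divide each cell of size $l$ into $M$ subcells of the size $l/M$. To preserve the cell occupation probability $p$ we need to assume that the occupation probability of a subcell is $p_s = p^{1/M}$, so that the geometric distribution (1) written for $m_s$ subcells becomes

$$n_c(m_s) = (1 - p_s)\,p_s^{m_s - 1}. \tag{22}$$

Defining the length $x$ corresponding to $m_s$ subcells as $x = m_s l/M$, with the increment $dx = l/M$, the relation for the continuous probability density is

$$f_c(x)\,dx = n_c(m_s). \tag{24}$$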

### a. Infinite lattice

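The continuous limit is obtained as $M, m_s \to \infty$, while preserving the value of $x$. Taking into account that $e^y \approx (1 + y)$ when $y = 1/M \ll 1$, we find that $1 - p_s \approx -\ln p / M$, which leads to the exponential density $f_c(x)$:

$$f_c(x) = \frac{1}{L_c} \exp\left(-\frac{x}{L_c}\right), \tag{26}$$

where the mean cloud length is

$$L_c = -\frac{l}{\ln p}. \tag{27}$$

The gap length density has the same form with $L_c$ replaced by

$$L_g = -\frac{l}{\ln(1 - p)}. \tag{29}$$

The model parameters $p$ and $l$ are uniquely determined by the observed values of $L_c$ and $L_g$. Indeed, $p$ can be found as a (numerical) solution of

$$\frac{\ln p}{\ln(1 - p)} = \frac{L_g}{L_c},$$

after which $l$ can be determined as

$$l = -L_c \ln p.$$

Thus, $L_c$ and $L_g$ provide an alternative (but equivalent) parameterization of the model, and in some approaches (e.g., Astin and Di Girolamo 1999) the use of $p$ and $l$, as well as the whole “cellular” concept, is avoided.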
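
The inversion from observed mean lengths to cellular parameters can be sketched as follows. The code assumes the relations $L_c = -l/\ln p$ and $L_g = -l/\ln(1 - p)$, so that $p$ solves $L_c \ln p = L_g \ln(1 - p)$. Bisection is used only because the left-minus-right side is monotonic in $p$; the function name and tolerance are illustrative.

```python
import math

def cell_parameters(L_c, L_g, tol=1e-12):
    """Recover (p, l) from mean cloud and gap lengths.

    Assumes L_c = -l / ln(p) and L_g = -l / ln(1 - p).
    """
    def g(p):
        # g increases monotonically from -inf (p -> 0) to +inf (p -> 1)
        return L_c * math.log(p) - L_g * math.log1p(-p)

    lo, hi = 1e-15, 1.0 - 1e-15
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    l = -L_c * math.log(p)
    return p, l

if __name__ == "__main__":
    # Round trip: start from p = 0.25, l = 2 (arbitrary units)
    L_c = -2.0 / math.log(0.25)
    L_g = -2.0 / math.log(0.75)
    print(cell_parameters(L_c, L_g))
```

Equal cloud and gap lengths, $L_c = L_g$, return $p = 1/2$, as expected from the symmetry between clouds and gaps.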

### b. Finite-length samples

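For a sample of finite length $L$, the continuous limit of the finite-sample distribution gives the density

$$f_c(x) = \frac{1}{L_c}\,\frac{2 + (L - x)/L_c}{1 + a_c}\,\exp\left(-\frac{x}{L_c}\right), \qquad a_c = \frac{L}{L_c}, \tag{33}$$

for $x < L$, while the overcast contribution does not vanish with $dx \to 0$, becoming

$$f_c^o = \frac{e^{-a_c}}{1 + a_c}. \tag{34}$$

The absence of the factor $dx$ means that the complete length distribution of clouds has a singularity at $x = L$:

$$F_c(x) = f_c(x) + f_c^o\,\delta(x - L).$$

Note that $F_c$, rather than $f_c$, is normalized:

$$\int_0^L f_c(x)\,dx + f_c^o = 1.$$

The mean cloud length is

$$x_c = \frac{L}{1 + a_c} = \frac{L\,L_c}{L + L_c}. \tag{38}$$

As expected, $x_c \to L_c$ as $L \to \infty$, while for finite $L$ we always have $x_c < L_c$; in particular, for a very small sample ($L \ll L_c$, $a_c \ll 1$) we expect $x_c \approx L$.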
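
A numerical check of the normalization and of the mean cloud length can be sketched as follows. The code assumes the finite-sample density $f_c(x) = e^{-x/L_c}\,[2 + (L - x)/L_c]/[L_c(1 + a_c)]$ with $a_c = L/L_c$ and the overcast weight $f_c^o = e^{-a_c}/(1 + a_c)$; these forms are obtained as the continuous limit of the discrete finite-sample counting and are labeled here as assumptions. The midpoint rule and the function names are illustrative.

```python
import math

def cloud_length_density(x, L, L_c):
    """Continuous part of the cloud length distribution for 0 <= x < L (assumed form)."""
    a_c = L / L_c
    return math.exp(-x / L_c) * (2.0 + (L - x) / L_c) / (L_c * (1.0 + a_c))

def overcast_weight(L, L_c):
    """Weight of the delta function at x = L (overcast samples)."""
    a_c = L / L_c
    return math.exp(-a_c) / (1.0 + a_c)

def integrate(func, L, n=20000):
    """Midpoint-rule integral of func over [0, L]."""
    dx = L / n
    return sum(func((i + 0.5) * dx) for i in range(n)) * dx

def total_probability(L, L_c):
    return integrate(lambda x: cloud_length_density(x, L, L_c), L) + overcast_weight(L, L_c)

def mean_cloud_length(L, L_c):
    body = integrate(lambda x: x * cloud_length_density(x, L, L_c), L)
    return body + L * overcast_weight(L, L_c)

if __name__ == "__main__":
    p, l, L = 0.5, 1.0, 1.0          # single-cell box, L = l
    L_c = -l / math.log(p)
    print("total probability:", total_probability(L, L_c))
    print("overcast weight:", overcast_weight(L, L_c), "vs p/(1 - ln p):", p / (1 - math.log(p)))
    print("mean length:", mean_cloud_length(L, L_c), "vs L/(1 + a_c):", L / (1 + L / L_c))
```

For the single-cell box the overcast weight reproduces the value $p/(1 - \ln p)$ discussed in the text.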

Technically, small $l/M = dx$ in Eq. (24) comes from the factor $(1 - p_s) \ll 1$ in Eq. (22), which is absent in the expression for overcast probability; however, the singularity of the cloud length distribution can also be explained by the independence of all-clear and overcast statistics on the size of the histogram bin. Note also that in the case of a single cell box ($N = 1$, $L = l$) the value of $f_c^o = p/(1 - \ln p)$ is not equal to 1 (unless $p = 1$), as it would be in the discrete case. In fact, it is always less than 1, allowing for a continuous population $f_c$ of subcell clouds.

The corresponding gap distribution statistics can be obtained from the above formulas by replacing $L_c$ and $a_c$ with $L_g$ and $a_g$.

### c. Cloud cover

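In the continuous model the cloud cover distribution consists of a continuous density on $0 < c < 1$ and $\delta$-function contributions from the all-clear ($c = 0$) and overcast ($c = 1$) samples [Eqs. (40)–(45)]. In these expressions $I_0$ and $I_1$ are modified Bessel functions. Note that for $N = 1$ the all-clear and overcast probabilities in the discrete model add up to unity, while in the continuous case they allow for subcell clouds.

The mean cloud cover does not depend on the sample length $L$:

$$\bar{c} = \frac{L_c}{L_c + L_g}, \tag{47}$$

or, in terms of the cell occupation probability,

$$\bar{c} = \frac{\ln(1 - p)}{\ln p + \ln(1 - p)}. \tag{48}$$

This relation can be used to determine $p$ from the observed cloud fraction. The plot of $\bar{c}$ versus $p$ is shown in Fig. 1. We see that $\bar{c} \ne p$, unless $p = 1/2$, 0, or 1: for $p < 1/2$ cloud cover is always smaller than $p$, and vice versa for $p > 1/2$.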
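
The relation between the mean cloud cover and $p$ can be explored with a short sketch. It assumes that the mean cloud cover equals the infinite-sample cloud fraction $L_c/(L_c + L_g)$ with $L_c = -l/\ln p$ and $L_g = -l/\ln(1 - p)$, which gives $\ln(1 - p)/[\ln p + \ln(1 - p)]$. The end fractions are computed under the further assumption that the first subcell is cloudy with probability equal to the mean cloud cover and that the run then persists over the whole sample with probability $p^{L/l}$ (or $q^{L/l}$ for clear sky). All function names are hypothetical.

```python
import math

def mean_cloud_cover(p):
    """Mean cloud cover as a function of the cell occupation probability p (0 < p < 1)."""
    ln_p, ln_q = math.log(p), math.log1p(-p)
    return ln_q / (ln_p + ln_q)

def p_from_cover(c, tol=1e-13):
    """Invert mean_cloud_cover by bisection (the function is monotonic in p)."""
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_cloud_cover(mid) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def end_fractions(p, L_over_l):
    """All-clear and overcast probabilities for a sample of length L (assumed form)."""
    c = mean_cloud_cover(p)
    clear = (1.0 - c) * (1.0 - p) ** L_over_l
    overcast = c * p ** L_over_l
    return clear, overcast

if __name__ == "__main__":
    for p in (0.1, 0.24, 0.5, 0.75):
        print(p, round(mean_cloud_cover(p), 4))
    print("p recovered from cover 0.16:", round(p_from_cover(0.16), 4))
    print("end fractions for p = 0.5, L = l:", end_fractions(0.5, 1.0))
```

Under these assumptions the single-cell case with $p = 0.5$ gives all-clear and overcast fractions of 25% each, consistent with the values quoted for the numerical examples in section 4.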

The distribution (C77) with the density (C78) and end components (C80) [leading to Eqs. (40), (42), (44), and (45)] has been also derived by Astin and Di Girolamo (1999) using an approach inspired by queue theory (Takacs 1957; Conolly 1971); however, they did not compute the mean in the finite sample case, and thus the initial probabilities *u* and *υ* = 1 − *u* remained undetermined. It also was not clearly spelled out in that study that the end values (C80) are the coefficients at the corresponding *δ* functions [as in Eq. (40)].

When the sample length $L$ is considerably larger than the mean sizes of clouds and gaps ($L \gg L_c, L_g$), the distribution density (42) becomes very narrow, while both completely clear sky and overcast cases become improbable. Thus, in the limit case of an infinite sample ($L = \infty$, assuming that $p \ne 0$ and $p \ne 1$, so that $a_c, a_g \to \infty$) we have a $\delta$-function distribution concentrated at the mean cloud cover.

## 4. Construction of examples

Generation of an example of a 1D or 2D cloud field according to a discrete model is quite straightforward since the cells are filled independently of each other. Thus, we do not show 1D examples from the discrete model, while a 2D example is presented (Fig. 8, top left). In a continuous model the situation is different since the subcells are no longer independent; that is, the filling probability of a subcell is conditioned on the clear/cloudy statuses of its neighbors.

### a. 1D case

In the 1D case the subcells are filled sequentially using the probabilities

$$p_s = p^{1/M}, \qquad q_s = q^{1/M}, \tag{51}$$

which play the role of $p$ and $q$ for the whole cells. As $M \to \infty$, $p_s, q_s \to 1$. Note that $p_s + q_s \ne 1$, since in distinction from $p$ and $q$ in the original (discrete) cellular model, $p_s$ and $q_s$ are conditional probabilities: $p_s$ ($q_s$) is the probability of a subcell to be cloudy (clear) given that the previous subcell is cloudy (clear). Thus, to construct a realization of a continuous model (which is not strictly speaking “continuous,” since in such an example $M$ is finite) we move through the sample lattice from left to right using the following rule: if the $i$th subcell is cloudy we use $p_s$ for cloudy and $1 - p_s$ for clear to find the status of the $(i + 1)$th subcell, and conversely, if the $i$th subcell is clear we use probability $q_s$ for clear and $1 - q_s$ for cloudy. Obviously, the switch between cloud and clear sky appears when the $(i + 1)$th subcell status differs from that of the $i$th subcell. The procedure continues until the end of the sample is reached. The probability of the first subcell to be cloudy should be the same as for any other subcell in the sample and thus equal to the mean cloud cover (47). The first subcell status can be selected directly according to this probability. Another way to do this (which was actually used to create the examples described below) is to create a longer sample (e.g., 3 times longer than required), with an arbitrary selection of the first subcell, and then take the last third of it as our example. This procedure is equivalent to the direct status selection, since Eq. (47) also gives the mean cloud cover of a long sample with an arbitrary first subcell status (the sample statistics “forget” that status after a certain length). We should mention that while the probability of transition from cloudy to clear is different from the probability of transition from clear to cloudy (if $p \ne 1/2$), the chord statistics do not depend on the direction of filling the sample (Sánchez et al. 1994).
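
The left-to-right filling rule can be sketched as a two-state Markov chain. The code below assumes the subcell persistence probabilities $p_s = p^{1/M}$ and $q_s = q^{1/M}$. As a simplification, the first subcell is drawn from the stationary cloudy fraction of the finite-$M$ chain, $(1 - q_s)/[(1 - p_s) + (1 - q_s)]$, which approaches the mean cloud cover of the continuous model as $M$ grows; the "longer sample" approach described above would serve the same purpose. The function name and arguments are illustrative.

```python
import random

def generate_transect(p, M, n_cells, seed=None):
    """Generate a 1D binary cloud transect of n_cells * M subcells.

    True means cloudy, False means clear. Assumes p_s = p**(1/M) and
    q_s = (1 - p)**(1/M) are the probabilities of keeping the current state.
    """
    rng = random.Random(seed)
    p_s = p ** (1.0 / M)
    q_s = (1.0 - p) ** (1.0 / M)
    # Stationary cloudy fraction of the two-state chain (used for the first subcell)
    cover = (1.0 - q_s) / ((1.0 - p_s) + (1.0 - q_s))
    state = rng.random() < cover
    transect = [state]
    for _ in range(n_cells * M - 1):
        stay = p_s if state else q_s
        if rng.random() >= stay:  # switch between cloud and clear sky
            state = not state
        transect.append(state)
    return transect

if __name__ == "__main__":
    sample = generate_transect(p=0.25, M=20, n_cells=15, seed=42)
    print("".join("#" if s else "." for s in sample))
    print("cloud cover of this sample:", sum(sample) / len(sample))
```

Repeating the call with different seeds produces an ensemble of samples from which cloud length, gap length, and cloud cover histograms can be accumulated.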

A set of examples generated according to this procedure is presented in Figs. 2–7. The plots are generated for $p = 0.5$, 0.25, and 0.1 and $L = 15l$ (Figs. 2–4, subdivision number $M = 20$) and $L = l$ (Figs. 5–7, subdivision number $M = 200$). Although, strictly speaking, one needs an infinite number of samples in order to use statistics for such a model, in practice we have to use a finite number of simulations. This number should be sufficiently large to closely approximate the “true” statistical distributions (represented by the theoretical curves in our case). We used 5000 samples in each of our simulations. The first 50 of them are shown in the top left panel of the corresponding figure, while the histograms in the other three panels are based on the whole 5000-sample datasets. The cloud cover histograms are in the top right panels with the solid lines depicting the theoretical probability density (42). The theoretical mean cloud covers are computed using Eq. (47), while all-clear and overcast fractions are computed using Eqs. (44) and (45). The bottom left panels show cloud length histograms with the theoretical densities (33) depicted by solid lines, while the dashed lines correspond to the infinite-sample density (26). The difference between the finite and infinite-sample densities is better seen in the case of short sample (Figs. 5–7), where the former shows better agreement with the simulations. The theoretical all-clear and overcast fractions are derived from Eq. (34). Expectedly, for short samples these fractions are relatively large. For example, for $p = 0.5$ they both are 25% in cloud cover for $N = 1$ and virtually zero for $N = 15$. Thus, in the former case 50% of samples have fractional cloud cover (note that in the discrete model such samples are not allowed). The theoretical mean cloud lengths are computed using Eq. (38), while the values of $L_c$ are determined from Eq. (27). The bottom right panels provide analogous descriptions of the gap length distributions. We see from these plots that the simulation histograms constructed using the “long sample” method described above agree with the theoretical curves. This agreement indicates that the theoretically derived first subcell status probabilities are correct, and also that our simulations are not biased. Examples for $p > 1/2$ (not shown) are similar to those shown: one just needs to interchange clouds and gaps (and, therefore, $p$ and $q$) and the cloud cover distribution histogram flips around the vertical axis at $c = 0.5$.

### b. 2D case

The discrete 2D model is a square grid of cells of size $l$ randomly filled with cloud (with probability $p$). The cloud (chord) length statistics are the same as in 1D case, since each row or column can be considered as a 1D sample with the same $l$ and $p$. Construction of subgrid examples in 2D is more complicated than in 1D because there is no clear direction for filling the sample. In the 2D case an unfilled subcell may have already filled neighbors with different statuses, making the decision of its own status ambiguous. While the correct way to fill a 2D sample may well exist, we have not found it yet. Thus, instead of a filling procedure, we use an ad hoc iterative “mixing” algorithm to construct 2D broken cloud fields. This algorithm follows the general ideology of a continuous cell model; however, agreement of the statistics of its output with those prescribed by the model still remains to be checked (this requires extensive simulations). We start by creating a discrete model realization described above on the $N \times N$ grid with spacing $l$. Then, we divide each cell into $M^2$ subcells, creating the initial $NM \times NM$ state for our iteration process. As in the 1D case, we define subgrid filling probabilities. At the $i$th iteration step each subcell is considered together with its eight neighbors: if $n$ of these neighbors are cloudy and $(8 - n)$ are clear, then the probability of our cell being cloudy on the next $(i + 1)$th step is prescribed as a function of $n$. In particular, if at the $i$th iteration step a subcell had seven or eight cloudy (clear) neighbors, it will be cloudy (clear) on the next step (with probability 1). If $p < 1/2$ ($p > 1/2$) this iteration procedure leads to monotonic decrease (increase) in cloud cover. Thus, to obtain an example consistent with our continuous model, the iterations should be stopped when the cloud cover comes close to the value prescribed by Eq. (48). To be precise, the ensemble average of cloud cover over a number of samples constructed in the same way (with the same number of iterations) should agree with the value from Eq. (48). However, if only one example needs to be constructed, reaching this value can be used as a signal to stop the iterative process.

An example of the above described construction of a 2D sample cloud field is shown in Fig. 8. For this sample we took *p* = 0.24 and a 15 × 15-cell grid. Each of the cells was divided into *M* × *M* subcells with *M* = 20. The cloud cover 0.276 of the initial discrete-model simulation (top left) is close to *p*, while during the mixing process it drops to 0.16 after 180 iterations (bottom right). We stop the iterations at this point, since this cloud cover value corresponds to our *p* according to Eq. (48). We see that the resemblance to the initial state gradually decreases during the mixing iterations. The final cloud field looks more realistic than the initial sample, since the clouds now have more detailed structure rather than being clusters of a few square cells.

## 5. Relationship to the Ising model

The 2D Ising model is widely studied both analytically and numerically with applications in phase transitions and critical phenomena. The classical approach to Monte Carlo simulations of this and other statistical lattice models is the Metropolis “importance sampling” algorithm (Metropolis et al. 1953). In this classical approach the states of the sample are generated recursively one from the other with some transition probability. The difference between the neighboring states is small: usually a single spin flip at a random location. Recently more computationally efficient methods have been developed—for example, cluster flip algorithms (Swendsen and Wang 1987; Wolff 1989), in which large regions of spins are flipped instead of single spins.

Our above described method for construction of a “realistic” 2D broken cloud field, while showing some resemblance to Ising model simulations, is different from the Metropolis algorithm and its relatives. While our method is also an iterative procedure, only its final state is considered as a representation of the continuous cloud model. Another representation should be generated independently. Our method also cannot be used for simulation of phase transitions, since we directly prescribe on the first step the positions of what would later become “domains” rather than letting them form naturally during a simulation.

The 1D construction of section 4a is a binary Markov process with the transition matrix

$$\mathsf{P} = \begin{pmatrix} P_{++} & P_{+-} \\ P_{-+} & P_{--} \end{pmatrix} = \begin{pmatrix} p_s & 1 - p_s \\ 1 - q_s & q_s \end{pmatrix}, \tag{54}$$

where $P_{ij}$ is the probability of transition from state $i$ into state $j$. Both $i$ and $j$ can take two values: a plus sign = +1 (cloudy) or a negative sign = −1 (clear). Each row of $\mathsf{P}$ sums to unity.

The 1D Ising model describes a set $\sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$ of $n$ random variables (spins) $\sigma_i$, each of which can take values of ±1 (indicative of cloudy or clear subcell in our model). The probability of the system being in a particular state $\sigma$ is

$$P(\sigma) = Z^{-1} \exp\left[-\frac{E(\sigma)}{kT}\right],$$

$$E(\sigma) = -J \sum_{j} \sigma_j \sigma_{j+1} - H \sum_{j} \sigma_j, \tag{56}$$

where $E(\sigma)$ is the state’s energy, $Z$ is the partition function, $T$ is the system’s temperature, $k$ is Boltzmann’s constant, $H$ is the magnetic field, and $J$ is the coupling constant. Here we do not discuss boundary conditions, which may alter the summation in Eq. (56). It can be shown (cf. Baxter 1982) that $P(\sigma)$ can be represented as a product of probabilities $P_j$ involving only neighboring spins, whose form includes constants $a$ and $b$ necessary for consistency with the definition of transition probability. Indeed, $P_j$ can be considered as the probability of transition from the state of the $j$th spin to a state of its right neighbor. This means that the values of $P_j$ form the transition matrix $\mathsf{P}'$ of a binary Markov process, which has the form Eq. (54). The relationship of the dimensionless coupling constant $K = J/kT$ and field $h = H/kT$ to $p_s$ and $q_s$ (or equivalently, $p$ and $M$) can be determined from the equality of $\mathsf{P}$ and $\mathsf{P}'$. In particular, when $p = q = 1/2$ both $h$ and $a$ vanish. Recalling the relationship in Eq. (51) of $p_s$ and $q_s$ to $p$ and $M$, we can see that in the continuous limit as $M \to \infty$ the coupling constant $K$ grows logarithmically with $M$, while $h$ decreases as $M^{-1}$.

A relationship, similar to that described above, between the parameters of our cellular cloud model and those of the Ising model is likely to hold in the 2D case. This opens the possibility of considering an extensive arsenal of simulation methods and analytical techniques developed for the 2D Ising model for construction of broken cloud field samples and to study their statistical properties, which we leave for our future studies.

## 6. Discussion

We next discuss the differences between our approach and that described in the series of papers by Astin et al. (2001), van de Poll et al. (2006), and Settle and van de Poll (2007), which are also devoted to the subject of cloud transect statistics. In these studies the authors solve an inverse statistical problem: they use a fixed observed sample (or a relatively small set of samples) and use Bayesian statistics to estimate the underlying cloud model parameters, such as the cloud and gap exponents (equivalent to $L_c$ and $L_g$ in our terminology) and the true (infinite space) cloud fraction $c_\infty = L_c/(L_c + L_g)$. This estimation results in statistical distributions of the model parameters. These estimates, therefore, depend on the particular structure of the observed sample, such as the numbers of clouds and gaps, and when these numbers become large, the probability density of, say, $c_\infty$ becomes narrowly peaked. Entirely clear and overcast samples require special consideration in this approach. Conversely, our study is focused on a forward problem, which means that we start with fixed model parameters (such as $L_c$ and $L_g$) and study statistical properties of the ensemble of observations, which consists of all possible realizations of the model (of course, we have to use a limited but large subset of this ensemble in numerical simulations). Thus, in our approach we do not look for, for example, a distribution of $c_\infty$ conditioned by the observed sample (since $c_\infty$ is predetermined by the fixed model parameters); instead, we derive the distribution of the cloud cover values observed in the samples.

We also emphasize that in our computations of statistics for finite samples we do not imply that the sky outside the sample is clear (or cloudy). On the contrary, we assume that our finite sample is taken at random from a large, possibly infinite, statistically homogeneous sample. This means that the cloudy (clear) intervals at the ends of a sample may be (and most likely are) parts of clouds (gaps), rather than whole clouds (gaps) in that large sample. An overcast (entirely clear) condition appears when a sample is taken from within a cloud (gap) that is larger than its size. The partial clouds and gaps are treated in our approach on an equal footing with the whole ones. Note that the computations (presented in appendix A) of cloud and gap length distributions for a finite sample [Eqs. (6) and (7)] are made in the framework of the discrete model, where cells are independent; thus, the states of the cells outside the sample do not influence this distribution or its continuous limit [Eqs. (33)–(36)]. In the computation of the cloud cover distribution [Eqs. (40)–(45)], the choice of initial probabilities is made based on the requirement of homogeneity of the sample (the probability that the first subcell is cloudy is the same as for any other subcell in the sample, i.e., equal to the mean cloud cover; see appendixes C and D for details). If we had instead postulated clear conditions outside the sample (i.e., if the subcell to the left of the first subcell in the sample was assumed to be always clear), the initial probability would be $u = q_s$. This probability is close to unity for a large enough $M$; thus, the “clear outside” condition would create a bias toward clear between the beginning of the sample and the rest of it, thereby making it statistically inhomogeneous. The homogeneity condition appeared to be equivalent to the assumption that samples are picked from an infinite field, since it has been demonstrated that the mean cloud cover is the same in finite and infinite cases. Our analytical results are consistent with the statistics of our simulated examples, for the construction of which the samples were taken from the ends of sufficiently long records. Finally, all expressions derived in this study are invariant under simultaneous interchange of $p \leftrightarrow q$ and cloud ↔ gap. This invariance contradicts any assumption of a fixed (e.g., clear) condition outside of the sample, since such a transformation would change this condition to its opposite.

## 7. Concluding remarks

We have introduced two types of cellular models providing statistical descriptions of broken cloud-field properties. Computations were performed in 1D to derive statistical distributions of clouds and gaps between them by length, as well as of the cloud cover. The resulting formulas also describe cloud field properties measured along 1D transects in a 2D case. The first (discrete) model describes statistics of cloud fields on a discrete grid, each cell of which can be filled with cloud with a certain prescribed probability *p* (which does not depend on the state of the other cells in the sample). This basic model is too simplistic for adequate description of realistic cloud fields, since cloud-field parameters can take only discrete values within its framework. The second (continuous) model is an extension of the discrete model, and allows for continuous distributions of cloud/gap lengths and cloud fraction. We show in Part II of this paper (Alexandrov et al. 2010, hereafter Part II) that this model is in agreement with statistics of large-eddy cloud-field simulations. Note that the continuous model is not equivalent to a discrete model just with a larger number of smaller cells: the latter would produce rather homogeneous samples with no clumpy clouds, such as seen in Figs. 2 –8. In both discrete and continuous models we assume that the observed cloud/gap statistics are determined from samples of finite size. This means, in particular, that the samples within a cloud (gap), which is larger than the sample size, are interpreted as overcast (entirely clear) and are assigned cloud (gap) lengths equal to the sample size and the cloud fraction of 1 (0). In the continuous model such samples contribute to the singularities (*δ* functions) in the length distributions (at the length equal to the sample size) and in the cloud cover distribution (at the fractions of 0 and 1).

If $p$ does not change across the statistical ensemble, the cloud/gap lengths distributions in our continuous model are essentially exponential. The opposite “diverse ensemble” situation with power-law clouds and gaps is described in Part II of this paper. We should note that according to Eqs. (27) and (29) the set of cellular model parameters $p$ and $l$ is in a one-to-one correspondence with the set of the exponents ($L_c^{-1}$, $L_g^{-1}$) of the cloud and gap length distributions. This means that in the 1D case some of our results, such as the cloud cover distribution [Eqs. (40)–(45)], can also be obtained using the approach of Astin and Di Girolamo (1999). In this approach the exponential cloud/gap lengths distributions are postulated, and the samples are constructed by sequential alternating drawing of clouds and gaps from their respective distributions (without invocation of a cellular structure). The situation, however, becomes quite different when we attempt to construct a cloud field in the 2D case, where we lack the concept of a filling direction necessary for a sequential method described above. We currently do not have a rigorous continuous cellular model in 2D case; however, our preliminary analysis (section 4b; Fig. 8) shows that realistic examples of cloud fields in this case can be constructed through an iterative process starting at a realization of the discrete model. The cellular structure is essential for such a procedure.

In Part II of this paper we present quantitative comparisons demonstrating that our model is generally in good agreement with the statistics of the cloud fields obtained using large-eddy simulations. We will also describe statistical properties of diverse cloud datasets obtained from observations with wide temporal and/or geographical coverage. Our future plans include development of a 2D analog of the continuous model described in this paper, as well as extension of the presented techniques from binary (cloud/clear) datasets to those with continuous values (such as LWP or cloud optical thickness). The latter will be achieved through consideration of multilayer binary models, where the statistics in each layer depend on the states of the layers below.

## Acknowledgments

This research was supported in part by the NASA Glory Project. The authors thank E. Kassianov, A. Korolev, A. Del Genio, B. Cairns, and A. Lacis for useful discussions, and also A. S. Schwarz for drawing our attention to the similarity between our continuous model and the 1D Ising model. We are also indebted to the reviewers of this paper for their detailed comments, which helped us to improve our work.

## REFERENCES

Aida, M., 1977: Scatter of solar radiation as a function of cloud dimensions and orientations. *J. Quant. Spectrosc. Radiat. Transfer*, **17**, 303–310.

Alexandrov, M. D., A. Marshak, B. Cairns, A. A. Lacis, and B. E. Carlson, 2004: Automated cloud screening algorithm for MFRSR data. *Geophys. Res. Lett.*, **31**, L04118. doi:10.1029/2003GL019105.

Alexandrov, M. D., A. S. Ackerman, and A. Marshak, 2010: Cellular statistical models of broken cloud fields. Part II: Comparison with a dynamical model and statistics of diverse ensembles. *J. Atmos. Sci.*, **67**, 2152–2170.

Astin, I., and B. G. Latter, 1998: A case for exponential cloud fields? *J. Appl. Meteor.*, **37**, 1375–1382.

Astin, I., and L. Di Girolamo, 1999: A general formalism for the distribution of the total length of a geophysical parameter along a finite transect. *IEEE Trans. Geosci. Remote Sens.*, **37**, 508–512.

Astin, I., L. Di Girolamo, and H. M. van de Poll, 2001: Bayesian confidence intervals for true fractional coverage from finite transect measurements: Implications for cloud studies from space. *J. Geophys. Res.*, **106**, 17303–17310.

Baxter, R. J., 1982: *Exactly Solved Models in Statistical Mechanics*. Academic Press, 498 pp.

Berg, L. K., and E. I. Kassianov, 2008: Temporal variability of fair-weather cumulus statistics at the ACRF SGP site. *J. Climate*, **21**, 3344–3358.

Breon, F-M., 1992: Reflectance of broken cloud fields: Simulation and parameterization. *J. Atmos. Sci.*, **49**, 1221–1232.

Byrne, N., 2005: 3D radiative transfer in stochastic media. *Three-Dimensional Radiative Transfer in Cloudy Atmospheres*, A. Marshak and A. B. Davis, Eds., Springer, 385–424.

Cahalan, R. F., 1994: Bounded cascade clouds: Albedo and effective thickness. *Nonlinear Processes Geophys.*, **1**, 156–167.

Cahalan, R. F., and J. H. Joseph, 1989: Fractal statistics of cloud fields. *Mon. Wea. Rev.*, **117**, 261–272.

Conolly, B. W., 1971: On randomized random walks. *SIAM Rev.*, **13**, 81–99.

Ellingson, R. G., 1982: On the effects of cumulus dimensions on longwave irradiance and heating rate calculations. *J. Atmos. Sci.*, **39**, 886–896.

Evans, K. F., and W. J. Wiscombe, 2004: An algorithm for generating stochastic cloud fields from radar profile statistics. *Atmos. Res.*, **72**, 263–289.

Gradshteyn, I. S., and I. M. Ryzhik, 1965: *Table of Integrals, Series, and Products*. Academic Press, 1130 pp.

Harrison, L., J. Michalsky, and J. Berndt, 1994: Automated multifilter rotating shadow-band radiometer: An instrument for optical depth and radiation measurements. *Appl. Opt.*, **33**, 5118–5125.

Hogan, R. J., and S. F. Kew, 2005: A 3D stochastic cloud model for investigating the radiative properties of inhomogeneous cirrus clouds. *Quart. J. Roy. Meteor. Soc.*, **131**, 2585–2608.

Holben, B. N., and Coauthors, 1998: Aeronet—A federated instrument network and data archive for aerosol characterization. *Remote Sens. Environ.*, **66**, 1–16.

Isichenko, M. B., 1992: Percolation, statistical topography, and transport in random media. *Rev. Mod. Phys.*, **64**, 961–1043.

Joseph, J. H., and R. F. Cahalan, 1990: Nearest neighbor spacing of fair weather cumulus clouds. *J. Appl. Meteor.*, **29**, 793–805.

Kassianov, E., 2003: Stochastic radiative transfer in multilayer broken clouds. Part I: Markovian approach. *J. Quant. Spectrosc. Radiat. Transfer*, **77**, 373–394.

Kite, A., 1987: The albedo of broken cloud fields. *Quart. J. Roy. Meteor. Soc.*, **113**, 517–531.

Koren, I., L. Oreopoulos, G. Feingold, L. A. Remer, and O. Altaratz, 2008: How small is a small cloud? *Atmos. Chem. Phys.*, **8**, 3855–3864.

Lane, D. E., K. Goris, and R. C. J. Somerville, 2002: Radiative transfer through broken clouds: Observations and model validation. *J. Climate*, **15**, 2921–2933.

Lane-Veron, D. E., and R. C. J. Somerville, 2004: Stochastic theory of radiative transfer through generalized cloud field. *J. Geophys. Res.*, **109**, D18113. doi:10.1029/2004JD004524.

Malvagi, F., R. N. Byrne, G. Pomraning, and R. C. J. Somerville, 1993: Stochastic radiative transfer in a partially cloudy atmosphere. *J. Atmos. Sci.*, **50**, 2146–2158.

Marshak, A., A. Davis, R. F. Cahalan, and W. J. Wiscombe, 1994: Bounded cascade models as nonstationary multifractals. *Phys. Rev. E*, **49**, 55–69.

Metropolis, N., A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, 1953: Equation of state calculations by fast computing machines. *J. Chem. Phys.*, **21**, 1087–1092.

Nagel, K., and E. Raschke, 1992: Self-organizing criticality in cloud formation? *Physica A*, **182**, 519–531.

Nicholls, S., 1989: The structure of radiatively driven convection in stratocumulus. *Quart. J. Roy. Meteor. Soc.*, **115**, 487–511.

Plank, V. G., 1969: The size distribution of cumulus clouds in representative Florida populations. *J. Appl. Meteor.*, **8**, 46–67.

Prigarin, S. M., and A. Marshak, 2009: A simple stochastic model for generating broken cloud optical depth and cloud-top height fields. *J. Atmos. Sci.*, **66**, 92–104.

Prigarin, S. M., T. B. Zhuravleva, and P. B. Volikova, 2002: Poisson model of broken multilayer cloudiness. *Atmos. Ocean. Opt.*, **15**, 947–954.

Prudnikov, A. P., Y. A. Brychkov, and O. I. Marichev, 1986: *Elementary Functions*. Vol. 1, *Integrals and Series*, Gordon and Breach, 806 pp.

Prudnikov, A. P., Y. A. Brychkov, and O. I. Marichev, 2002: *Special Functions*. Vol. 2, *Integrals and Series*, Taylor and Francis, 758 pp.

Ramirez, J. A., and R. L. Bras, 1990: Clustered or regular cumulus cloud fields: The statistical character of observed and simulated cloud fields. *J. Geophys. Res.*, **95**, 2035–2045.

Randall, D. A., and G. J. Huffman, 1980: A stochastic model of cumulus clumping. *J. Atmos. Sci.*, **37**, 2068–2078.

Rodts, S. M. A., P. G. Duynkerke, and H. J. J. Jonker, 2003: Size distributions and dynamical properties of shallow cumulus clouds from aircraft observations and satellite data. *J. Atmos. Sci.*, **60**, 1895–1912.

Sánchez, A., T. F. Smith, and W. F. Krajewski, 1994: A three-dimensional atmospheric radiative transfer model based on the discrete-ordinates method. *Atmos. Res.*, **33**, 283–308.

Sauvageot, H., F. Mesnard, and R. S. Tenorio, 1999: The relation between the area-average rain rate and the rain cell size distribution parameters. *J. Atmos. Sci.*, **56**, 57–70.

Schertzer, D., and S. Lovejoy, 1987: Physical modeling and analysis of rain and clouds by anisotropic scaling multiplicative processes. *J. Geophys. Res.*, **92**, 9693–9714.

Schmidt, K. S., V. Venema, F. D. Giuseppe, R. Scheirer, M. Wendisch, and P. Pilewskie, 2007: Reproducing cloud microphysics and irradiance measurements using three 3D cloud generators. *Quart. J. Roy. Meteor. Soc.*, **133**, 765–780.

Settle, J. J., and H. M. van de Poll, 2007: On the Bayesian estimation of cloud fraction from lidar transects. *J. Geophys. Res.*, **112**, D09211. doi:10.1029/2006JD007251.

Smirnov, A., B. N. Holben, T. F. Eck, O. Dubovik, and I. Slutsker, 2000: Cloud screening and quality control algorithms for the AERONET database. *Remote Sens. Environ.*, **73**, 337–349.

Su, B. J., and G. C. Pomraning, 1994: A stochastic description of a broken cloud field. *J. Atmos. Sci.*, **51**, 1969–1977.

Swendsen, R. H., and J-S. Wang, 1987: Nonuniversal critical dynamics in Monte Carlo simulations. *Phys. Rev. Lett.*, **58**, 86–88.

Takacs, L., 1957: On certain sojourn time problems in the theory of stochastic processes. *Acta Math. Acad. Sci. Hung.*, **8**, 169–191.

Titov, G. A., 1990: Statistical description of radiation transfer in clouds. *J. Atmos. Sci.*, **47**, 24–38.

Van de Poll, H. M., H. Grubb, and I. Astin, 2006: Uncertainty properties of cloud fraction estimates from random transect observations. *J. Geophys. Res.*, **111**, D22218. doi:10.1029/2006JD007189.

Venema, V., S. Bachner, H. W. Rust, and C. Simmer, 2006: Statistical characteristics of surrogate data based on geophysical measurements. *Nonlinear Processes Geophys.*, **13**, 449–466.

Wielicki, B. A., and R. M. Welch, 1986: Cumulus cloud properties derived using Landsat satellite data. *J. Climate Appl. Meteor.*, **25**, 261–276.

Wolff, U., 1989: Collective Monte Carlo updating for spin systems. *Phys. Rev. Lett.*, **62**, 361–364.

Yau, M. K., and R. R. Rogers, 1984: An inversion problem on inferring the size distribution of precipitation areas from raingage measurements. *J. Atmos. Sci.*, **41**, 439–447.

Zuev, V. E., and G. A. Titov, 1995: Radiative transfer in cloud fields with random geometry. *J. Atmos. Sci.*, **52**, 176–190.

## APPENDIX A

### Discrete Cloud Length Distribution for Finite Samples

*N* cells, we compute statistical sums, to which cloud field realizations contribute with the weights equal to their probabilities. For each cloud length *m* < *N* we consider two cases: when the cloud under consideration is at the end of the observation interval

*i* = 2, …, *N* − *m* and the cells occupied by clouds are depicted by bullets, while the clear cells are shown by open circles. In the first case each realization contributes to the statistical sum with the weight *p*^{m}*q* × weight of the last (*N* − *m* − 1) cells. The sum of the weights of the realizations of the last (*N* − *m* − 1) cells is 1, since we sum up the probabilities of all possible cases. Thus, the contribution of the end-interval case (A1) is just *p*^{m}*q*, and there are two such cases (at each end of the interval). Similarly, the contribution of a midinterval case (A2) is *p*^{m}*q*^{2}, and we have (*N* − *m* − 1) cases like this. Thus, the statistical sum for the cloud of the size *m* < *N* is

$$S_m = 2\,p^m q + (N - m - 1)\,p^m q^2 .$$

*m* = *N*)

*S*_{m}, *m* = 1, …, *N*:

*p*. Thus, the cloud size distribution takes the form

*m* < *N* and

*m* = *N*. By construction, *n*_{c} is normalized by the condition

*p*) of the finite geometric series formula, and, after some algebraic transformations, obtain the following simple expression for the cloud mean length:
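The counting argument of this appendix can be checked by brute force for small *N*. The Python sketch below enumerates all 2^N realizations, accumulates the probability weight of every cloud (maximal run of cloudy cells) by its length, and compares the result with a closed form. The closed form coded in `closed_form` is our reading of the contributions described above (two end-interval cases with weight *p*^{m}*q* and *N* − *m* − 1 midinterval cases with weight *p*^{m}*q*^{2}, plus the overcast weight *p*^{N}); the function names are hypothetical.

```python
from itertools import groupby, product
from math import isclose


def brute_force_sums(N, p):
    """S[m] = sum over all realizations of (probability x number of clouds of length m)."""
    q = 1.0 - p
    S = [0.0] * (N + 1)
    for cells in product((0, 1), repeat=N):
        k = sum(cells)
        weight = p**k * q ** (N - k)  # probability of this realization
        for state, run in groupby(cells):
            if state == 1:
                S[len(list(run))] += weight
    return S


def closed_form(N, p):
    """Assumed closed form: end-interval and midinterval contributions, plus overcast."""
    q = 1.0 - p
    S = [0.0] * (N + 1)
    for m in range(1, N):
        S[m] = 2 * p**m * q + (N - m - 1) * p**m * q**2
    S[N] = p**N  # overcast sample: all N cells cloudy
    return S


def mean_cloud_length(S):
    """Mean of the normalized cloud length distribution n_c(m) = S[m] / sum(S)."""
    return sum(m * s for m, s in enumerate(S)) / sum(S)


if __name__ == "__main__":
    N, p = 6, 0.3
    print(brute_force_sums(N, p))
    print(closed_form(N, p))
    print(mean_cloud_length(closed_form(N, p)))
```

The enumeration grows as 2^N, so this check is only practical for small samples; it is meant as a sanity test of the weights, not as a way to evaluate the distribution.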

## APPENDIX B

### Cloud Length Statistics in the Continuous Model with Finite Samples

#### Normalization condition

*x* < *L*, and *x* = *L*) contribution. Thus, the integral (B2) takes the form

*t* = *x*/*L*_{c} transforms into

#### Computation of the mean

*t* = *x*/*L*_{c} (and using *L* = *a*_{c}*L*_{c} in the last term) becomes

## APPENDIX C

### Cloud Cover Distribution in the Continuous Model

*N*-cell box we subdivide each cell into *M* subcells (we will take the limit *M* → ∞ to obtain the continuous case). Each cell has size *l*, so the subcell size is *l*/*M*. The total number of subcells and the total length of the sample are, respectively,

$$n = NM, \qquad L = Nl .$$

*p*_{s} and *q*_{s} according to Eq. (51) (note that *p*_{s}, *q*_{s} → 1 as *M* → ∞). We remind the reader that to construct a realization of the model, we move through our sample from left to right using the following rule: if the *i*th subcell is cloudy we use a draw with probability *p*_{s} for cloudy and 1 − *p*_{s} for clear to find the status of the (*i* + 1)th subcell, and conversely, if the *i*th subcell is clear we use a draw with probability *q*_{s} for clear and 1 − *q*_{s} for cloudy. The procedure continues until the end of the box is reached. In general, the status of the first cell is selected by a draw with an initial probability *u* to be cloudy (we also define *υ* = 1 − *u*), which is a free parameter of the model influencing the structure of the sample. To determine this initial probability we impose the following homogeneity condition: the probability of the first cell to be cloudy is the same as of any other cell in the sample (thus, equal to the mean cloud cover).
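The left-to-right construction rule above is a two-state Markov chain, and it can be sketched in Python as follows. The persistence probabilities `p_s` and `q_s` are treated here as given inputs (their expression through Eq. (51) is not reproduced), and the default initial probability is taken to be the stationary cloudy probability of the chain, which is one way to read the homogeneity condition; both function names are hypothetical.

```python
import random


def stationary_cloud_probability(p_s, q_s):
    """Cloudy probability u that is preserved by one step of the chain.

    Solves u = u * p_s + (1 - u) * (1 - q_s); requires p_s and q_s not both 1.
    """
    return (1.0 - q_s) / ((1.0 - p_s) + (1.0 - q_s))


def markov_sample(n, p_s, q_s, rng, u=None):
    """Draw n subcells (1 = cloudy, 0 = clear) moving from left to right."""
    if u is None:
        # Assumption: homogeneity condition read as the stationary probability.
        u = stationary_cloud_probability(p_s, q_s)
    state = 1 if rng.random() < u else 0
    cells = [state]
    for _ in range(n - 1):
        stay = p_s if state else q_s  # probability of keeping the current state
        if rng.random() >= stay:
            state = 1 - state  # switch point: cloud to clear or clear to cloud
        cells.append(state)
    return cells


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only for repeatability
    cells = markov_sample(60, 0.9, 0.8, rng)
    print("".join("#" if c else "." for c in cells))
```

With `p_s` and `q_s` close to 1 the chain produces long cloudy and clear intervals, which is the clumpy structure that distinguishes this model from independent cells.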

#### Computation of the statistical sum

*k* cloudy subcells out of *n* subcells in the box (*c* = *k*/*n*), we compute the statistical sum as an expansion over the number *r* = 1, 2, 3, … of “quasi-cells” defined as interchanging clear and cloudy intervals of arbitrary length. For example, for *r* = 1 we have two situations: completely overcast

*M* and therefore will contribute to the singular part of the probability density. For *r* = 2 we also have two cases:

*r* = 3:

*r* = 2*i* the two cases are

*r* = 2*i* + 1 they are

To construct the statistical sum we assign weights *p*_{s} to all cloudy subcells, except for those where the switch from clear to cloudy happened (first subcells of cloudy intervals), which are assigned the weight (1 − *q*_{s}). Similarly, the first subcells of clear intervals are assigned the weight (1 − *p*_{s}), while all other clear subcells will enter the sum with weight *q*_{s}. The first subcell of the whole sample is assigned the initial probability weight *u* if it is cloudy and *υ* = 1 − *u* if it is clear (these weights are free parameters to be determined from additional constraints described further below).

*k* and (*n* − *k*), respectively], while the lengths of particular clear and cloudy intervals may vary. Thus, the weight of each diagram is multiplied by the number of ways the cloudy and clear subcells (only those that are not switch points) can be placed into the respective sets of intervals. To compute this number we solve a simple combinatorial problem about the number of ways to put *m* identical balls into *n* numbered boxes. This problem can be solved by treating the balls and the walls separating the boxes [(*n* − 1) of them] on equal footing (e.g., considering the walls as balls of a different color), and the solution is

$$\binom{m + n - 1}{n - 1} = \frac{(m + n - 1)!}{m!\,(n - 1)!} .$$
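The ball-and-box count described above is the standard "stars and bars" result, the binomial coefficient C(*m* + *n* − 1, *n* − 1). The short Python sketch below verifies it by direct enumeration for small *m* and *n*; the function names are hypothetical and the enumeration is only a sanity check.

```python
from itertools import product
from math import comb


def count_placements_brute(m, n):
    """Count occupancy tuples (one entry per numbered box) that sum to m balls."""
    return sum(1 for occ in product(range(m + 1), repeat=n) if sum(occ) == m)


def count_placements(m, n):
    """Balls and walls on equal footing: choose positions of the n - 1 walls."""
    return comb(m + n - 1, n - 1)


if __name__ == "__main__":
    for m in range(5):
        for n in range(1, 5):
            print(m, n, count_placements_brute(m, n), count_placements(m, n))
```

In the diagram weights this count is applied twice per diagram, once for the free cloudy subcells distributed among the cloudy intervals and once for the free clear subcells distributed among the clear intervals.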

*r* = 2*i*. For this diagram we have (*i* − 1) cloudy switch subcells (switching from clear to cloud) and *i* clear switch subcells (switching from cloud to clear). We also have *k* − *i* + 1 free (nonswitch) cloudy subcells that should be distributed between the *i* intervals and (*n* − *k* − *i*) free clear subcells also to be put into *i* intervals. Thus the weight of the first diagram will be

*i* cloudy and *i* clear intervals, *i* cloudy and (*i* − 1) clear switch subcells, and (*k* − *i*) cloudy and (*n* − *k* − *i* + 1) clear free subcells. Thus, the weight of this diagram is

(*i* + 1) cloudy and *i* clear intervals, *i* cloudy and *i* clear switch subcells, and (*k* − *i*) cloudy and (*n* − *k* − *i*) clear free subcells. This leads to the weight

*i* cloudy and *i* + 1 clear intervals, and, as well as the first odd diagram, *i* cloudy and *i* clear switch subcells, and (*k* − *i*) cloudy and (*n* − *k* − *i*) clear free subcells. Thus, its weight is

*i* running from 1 (we consider the all-clear/overcast case *r* = 1 separately) to min(*k*, *n* − *k*). The upper limit goes to infinity with the cell subdivision number *M*; thus, we can replace the finite sums by the infinite series. Let us first compute the coefficients *K* in the limit of *k*, (*n* − *k*) → ∞. To avoid repetition of computations similar for the coefficients *K* in Eqs. (C13)–(C16) we denote by *m* the number going to infinity in the continuous limit [*k* or (*n* − *k*)], and by *j* the number that remains finite (combinations of ±*i* and ±1). Then, the majority of the coefficients *K* above have the form

*i* → (*i* − 1) and introduced the notation

*z* of Eqs. (C29) and (C30) is

*M* → ∞

*z*:

*l* is the cell size, *c* = *k*/*n* = *x*/*L* is the cloud cover. Using Eqs. (C33)–(C37), and also noticing that

*z* is defined by Eq. (C37). Thus, the expression for the total density can be written in the form

#### Normalization condition

*S*(*c*) by construction should satisfy the normalization condition

*w*_{0} and *w*_{1} are the respective weights (C5) and (C3) of all-clear and overcast samples, which in the notation of Eq. (C36) have the form

*S*(*c*) between 0 and 1. To do this it is convenient to split the integral into three parts:

*c* ∈ [0, ½] and back from 1 to 0 when *c* ∈ [½, 1]. Thus, *c* can be expressed through *t* as

*c* < ½, and the plus sign is used when *c* > ½, and the integral over *t* includes both terms. We also have

_{1}, and

*ν* = 1, *α* = 1) this expression can be simplified using the relation [cf. Gradshteyn and Ryzhik (1965)]

_{3} takes the following form:

*p* = ½ when *a*_{c} = *a*_{g} ≡ *a*, it looks even simpler:

_{2b} we notice that the integral (C66) can be written in the form

*β* yields the following formula:

_{2b} using the substitution (C67):

_{0} of a general case. Thus, we guess the form of the solution and then verify the result numerically. As a basis for our guess we take the solution in the specific case of *p* = ½ (*a*_{c} = *a*_{g} ≡ *a*), which can be found using another table integral (Prudnikov et al. 2002):

*p* may have the following form:

*u* and *υ*.

#### General form of the singular density

*Z* is defined as

## APPENDIX D

### Mean Cloud Cover in the Continuous Model

*c* is expressed according to Eq. (C53); as for the normalization, we express the integral in Eq. (D1) as a sum of three terms:

_{2} as

_{2a} is similar to _{1}, so the two can be combined into

_{0} is defined by Eqs. (C59) and (C74), while

_{3} is defined by Eqs. (C63) and (C68), while

_{2b} is given by Eqs. (C62) and (C72). Here

_{2b} takes the following form:

_{0} we need to extend [reversing the substitution (C67)] our guess (C74) to the following relation:

*β*, we come to the following:

_{0} as

_{2b} we again use Eq. (D13). Differentiating Eq. (D13) with respect to *γ* and noticing that (cf. Gradshteyn and Ryzhik 1965)

*β* ↔ *γ* interchange, so the result of Eq. (D17) can be also obtained from Eq. (D14) simply by interchanging these variables. The expression for

_{2b} now takes the form

*p* = 0 (no cloud)

*L*_{c} = 0 and *a*_{c} = ∞, while *L*_{g} = ∞ and *a*_{g} = 0, and it is easy to see that

*c*

*p* = 1 (completely overcast)

*L*_{c} = ∞ and *a*_{c} = 0, while *L*_{g} = 0 and *a*_{g} = ∞, so

*c*

#### Infinite sample

*L* → ∞ when the sample length is considerably larger than the mean sizes of clouds and gaps (here we assume that 0 < *p* < 1, so that *L*_{c} and *L*_{g} are finite), the distribution (C77) becomes very narrow, making both completely clear and overcast cases impossible:

*a*_{c}, *a*_{g} → ∞). Equation (D23) for the mean cloud fraction takes the form

*u* and *υ*.

*ν*: