## Abstract

The proposed European Space Agency’s cloud profiling radar Millimetre Active Cloud Structure Imaging Mission is a nadir-pointing radar with a 1-km footprint; it will need to integrate the received signal power for a reasonable amount of time (1.4–14 s) along track to detect a cloud. As a result, the radar will provide a set of long (10–100 km) narrow pixels that are registered either as cloudy or as clear, depending on how much cloud there is in each. These are thus likely to give a biased estimate of fractional cloud cover over a region because the radar will be unable to detect small clouds or gaps. For a nadir-pointing radar the clouds and gaps essentially form a 1D sequence that can be modeled by a general single-server queue. This model allows analytical expressions to be found for the bias and sampling error in specific cases. Such expressions are presented for the two extremes of the Erlang queuing model, that is, for an exponential and a deterministic queue. These are then compared to results derived from satellite images of clouds.

## 1. Introduction

MACSIM (Millimetre Active Cloud Structure Imaging Mission), the proposed European Space Agency (ESA) cloud profiling radar, is to be nadir pointing and have a 1-km footprint; it will need to integrate the received signal power for a reasonable amount of time (1.4–14 s) along track to detect most radiatively significant clouds. As a result, the radar will provide a set of long (10–100 km) narrow pixels that are registered either as cloudy or as clear, depending on whether the pixel is cloud filled or not. These are thus likely to give a biased estimate of fractional cloud cover over a region that is dependent on the size of the pixel because the size and number of small clouds and gaps cannot be determined (Shenk and Salomonson 1972; Wielicki and Parker 1992; Luo 1995). In these and other studies (Randall and Huffman 1980; Ellingson 1982; Ramirez and Bras 1990), 2D images of clouds are simulated either by degrading existing satellite images of clouds or by seeding clouds randomly over an area and allowing them effectively to “grow” in size to give a distribution of cloud sizes. As this “seed” model assumes that cloud centers are uniformly randomly distributed over an area, it can be used (Pielou 1960; Diggle et al. 1976; Ramirez and Bras 1990) to investigate clustering and regularity in cloud fields (Weger et al. 1992; Zhu et al. 1992; Mapes 1993; Mapes and Houze 1993). However, a different approach to forming the cloud field is taken here, which exploits the sampling characteristics of the radar, and in which the field is generated only at the points observed by the radar. The radar is to fly on a low earth orbit satellite and will thus travel over the top of the clouds at high speeds (∼7–8 km s^{−1}) looking vertically downward. Thus, from the point of view of the satellite, the clouds and gaps between clouds will appear as a 1D sequence (in time) of clear and cloudy intervals following on from each other. 
This 1D property can be exploited by modeling the radar–cloud interaction in terms of a single-server queue with the radar being the “server,” the cloudy intervals being the service times, and the time between clouds being the interarrival times. These can be related to cloud and gap lengths by multiplying the duration of each by the speed of motion of the nadir point over the clouds. The values of mean service rate (equivalent to 1/mean cloud length) and mean arrival rate are fundamental to queuing theory (Conolly 1975) but will be meaningless if clouds are fractal (Lovejoy 1982; Joseph 1985; Cahalan and Joseph 1989; Zhu et al. 1992; Tessier et al. 1993; Kuo et al. 1993; Chatterjee et al. 1994). In such a case the cloud length distribution should follow a power law (of dimension between 1 and 2; Sreenivasan et al. 1989), and such distributions have no finite mean. However, there is evidence that on a large number of occasions cloud sizes (Plank 1969; Wielicki and Welch 1986; Welch and Wielicki 1986) and cloud cluster sizes (Lopez 1977; Webster and Lukas 1992; Machado and Rossow 1993) do not follow a power law, with cloud sizes often following an exponential distribution. As a result we feel justified in assuming in many cases that finite values for mean cloud and gap lengths do exist and that a queuing model is appropriate in such cases.

One of the first to model queues (Conolly 1975, 2) at the beginning of this century was Erlang, who considered the behavior of telephone exchanges, which by their nature have only a finite capacity; and in his honor the ratio of mean service time to mean interarrival interval, although a dimensionless quantity, is measured in Erlangs. The family of distributions [as seen in Eq. (1)] that was used to model the system and that is closely related to the family of gamma distributions is also called Erlang. The first of these (as with the gamma distribution) is the exponential distribution and in the limit, with suitable choices of parameters, the Erlang distribution tends to the deterministic distribution. Thus, these two distributions are often taken to represent the two extremes of Erlang models for queues and we will consider both:

## 2. Cloud field models

### a. The exponential cloud field

We assume the cloud field (along the nadir track) to be exponential in form. That is, the length of a cloud or of a gap along the locus of the radar footprint can be considered as being “chosen” at random from exponential distributions of fixed mean length. Such distributions have the property of producing a small number of large clouds or gaps and a large (but finite) number of small clouds and gaps. In the notation of Kendall (Gorney 1981), this queuing model is written as M/M/1/1, that is, Markov (exponential) arrivals, Markov services, 1 server, and a maximum of 1 “customer” (cloud). The other extreme mentioned above is the deterministic queue, which in Kendall’s notation is written D/D/1, with D standing for deterministic; all the clouds are of the same size, as are all the gaps, and by definition clouds arrive one at a time. The 2D version of this model, which consists of identical clouds (either cubes or right circular cylinders) whose centers form a regular lattice, is used by several workers to investigate the effect of broken cloud fields on the scattering of solar radiation (Aida 1977; Kite 1987; Breon 1992). Another closely related stochastic model for a broken cloud field is proposed by Su and Pomraning (1994), in which clouds of fixed chord length in an infinite atmosphere imply Markov (exponential) intercloud spacing. In terms of queuing terminology, this would result in an M/D/1 queue and clearly would provide values that are similar to those for the D/D/1 or M/M/1 queues considered.

If in the exponential cloud field we take the mean cloud and gap lengths as *C* and *G,* respectively, then one can generate the field by producing a sequence of cloud and gap lengths following Eqs. (2) and (3), in which *r*_{1} and *r*_{2} are random numbers between 0 and 1 chosen from a uniform distribution:
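Assuming that Eqs. (2) and (3) take the usual inverse-transform form, that is, a cloud length of −*C* ln *r*_{1} and a gap length of −*G* ln *r*_{2}, the generation of such a field can be sketched as follows. The function names, the choice to begin the track with a gap, and the truncation at a finite track length are illustrative assumptions rather than part of the original formulation.

```python
import math
import random


def exponential_cloud_field(mean_cloud, mean_gap, total_length, rng=None):
    """Return a list of (start, end) cloud intervals along the track.

    Cloud and gap lengths are drawn by inverse-transform sampling from
    exponential distributions with means `mean_cloud` and `mean_gap`.
    """
    rng = rng or random.Random()
    clouds = []
    x = 0.0
    while x < total_length:
        # 1 - random() lies in (0, 1], so the logarithm is always finite.
        gap = -mean_gap * math.log(1.0 - rng.random())
        cloud = -mean_cloud * math.log(1.0 - rng.random())
        start = x + gap
        if start >= total_length:
            break
        # Truncate the last cloud at the end of the track.
        clouds.append((start, min(start + cloud, total_length)))
        x = start + cloud
    return clouds


def cloud_fraction(clouds, total_length):
    """Fraction of the track covered by the given cloud intervals."""
    return sum(end - start for start, end in clouds) / total_length
```

For a long track, `cloud_fraction` should approach *C*/(*C* + *G*); with hypothetical means of 10 and 20 km this is one-third.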

These fields can be used to conduct a sampling simulation to evaluate the errors in forming mean fractional cloud cover. However, it is possible to derive analytic expressions for such errors when sampling an exponential field by first considering the probability that it is clear or cloudy at a point at distance *x* along the subsatellite track. Such probabilities are usually written as *p*_{0}(*x*) and *p*_{1}(*x*), respectively, and as the field is exponential it can be modeled as a two-state Markov process (Cox and Miller 1965). These two states are cloud covered or cloud free; their respective probabilities are related by the coupled differential–difference equations given by Eqs. (4) and (5):

There is a clear symmetry in Eqs. (4) and (5) that is due to the fact that the sum of the two probabilities has to be equal to unity, which allows us to solve these equations as given in Eqs. (6) and (7):

If it is always clear or cloudy at *x* = 0 (either in the true cloud field or if in a simulation one starts with a cloud), the probability plots as a function of *x* will be of the form of Figs. 1a or 1b. These plots also show how the probability of a point being cloudy is affected by it being cloudy or clear nearby, a value that tends to zero or one as one approaches a cloudy or clear region. As one moves farther from *x* = 0, the probability that it is clear or cloudy tends to a constant value that is independent of the probability of it being clear at *x* = 0. These values are also by definition equal to the long-term fractional cloud cover, and to the fraction that is clear along the subsatellite track. If there is no reason to suppose that the probability that it is cloudy at *x* = 0 is specified, then one could assume that it is equal to the above long-term fractional cloud cover. This simplifies Eqs. (6) and (7), as the terms on the right-hand side involving *x* now become equal to zero. In this case, the probability that it is cloudy at a point is a constant independent of the position of the point and equal to the fractional cloud cover.
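The behavior described above can be illustrated with the standard solution of a two-state Markov process in which a cloud ends at rate 1/*C* and a gap ends at rate 1/*G*. The sketch below assumes that Eqs. (6) and (7) have this standard form; the function names are hypothetical.

```python
import math


def prob_cloudy(x, mean_cloud, mean_gap, p1_at_zero):
    """Probability that the point at distance x along the track is cloudy.

    Assumed form: the equilibrium cover C / (C + G) plus a transient that
    decays at the rate 1/C + 1/G and vanishes when p1(0) equals the
    equilibrium cover.
    """
    equilibrium = mean_cloud / (mean_cloud + mean_gap)
    decay_rate = 1.0 / mean_cloud + 1.0 / mean_gap
    return equilibrium + (p1_at_zero - equilibrium) * math.exp(-decay_rate * x)


def prob_clear(x, mean_cloud, mean_gap, p1_at_zero):
    """Probability of a clear point; the two probabilities sum to unity."""
    return 1.0 - prob_cloudy(x, mean_cloud, mean_gap, p1_at_zero)
```

Setting `p1_at_zero` to 1 or 0 gives curves that start from a cloudy or clear point and relax to the long-term cover, while setting it to *C*/(*C* + *G*) gives a probability that does not vary with *x*.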

#### 1) The bias

We are now in a position to evaluate the bias in estimates in cloud cover if clouds smaller than the radar pixel length *L* cannot be detected. The above discussion shows that in an equilibrium state, the probability that it is cloudy at the beginning of any radar pixel is a constant equal to the true fractional cloud cover. For an exponential field, the probability that this pixel is registered as cloudy is given by Eq. (8):

Equation (8) results from the unique property of the exponential field of having no “memory” of what has gone before the start of the pixel. It follows immediately that the “long-term” mean cloud cover as derived from the radar falls exponentially with pixel length, as in Eq. (9):

Thus, from Eq. (9) this underestimate will be large if the mean cloud lengths are small or the pixel length is large. As an example, the estimated cloud cover as a function of pixel size is displayed in Fig. 2, with mean cloud and gap lengths the same as in Fig. 1.
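A minimal sketch of this result follows, assuming that Eq. (9) is the product of the true cover *C*/(*C* + *G*) and the memoryless survival factor exp(−*L*/*C*). The analytic value is compared with a direct simulation in which contiguous pixels are laid along a simulated exponential field. The function names, the seed, and the parameter values are hypothetical.

```python
import math
import random


def radar_cover_exponential(mean_cloud, mean_gap, pixel_length):
    """Expected recorded cover when a pixel is cloudy only if completely full."""
    true_cover = mean_cloud / (mean_cloud + mean_gap)
    return true_cover * math.exp(-pixel_length / mean_cloud)


def simulated_radar_cover(mean_cloud, mean_gap, pixel_length, n_pixels, seed=0):
    """Fraction of contiguous pixels lying wholly inside a single cloud."""
    rng = random.Random(seed)
    total = n_pixels * pixel_length
    # Build the cloud intervals, starting the track in a gap.
    x, clouds = 0.0, []
    while x < total:
        x += rng.expovariate(1.0 / mean_gap)
        length = rng.expovariate(1.0 / mean_cloud)
        clouds.append((x, x + length))
        x += length
    # Walk through pixels and clouds together; both are ordered along track.
    full, j = 0, 0
    for i in range(n_pixels):
        a, b = i * pixel_length, (i + 1) * pixel_length
        # Skip clouds that end before the end of this pixel.
        while j < len(clouds) and clouds[j][1] < b:
            j += 1
        # The pixel is full only if that cloud also began before the pixel.
        if j < len(clouds) and clouds[j][0] <= a:
            full += 1
    return full / n_pixels
```

With equal mean cloud and gap lengths of 30 km and a 10-km pixel, the analytic value is 0.5 exp(−1/3), and the simulated value should agree to within the sampling error.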

The falloff of cloud cover with pixel size for the ATSR (Along Track Scanning Radiometer) satellite IR image of Fig. 3 is demonstrated in Fig. 4 (one of a number generated for an unpublished report to ESA by Astin and Latter). In this case, a pixel is registered as cloudy only if it is completely cloud filled and, as a result, the recorded cloud cover falls rapidly with increasing pixel length. Also shown in this figure are the best fits to this falloff by an exponential and a straight line. The exponential line is clearly a very good fit and hence from Eq. (8) the mean cloud length can be predicted as being 28 km. The ATSR image for this figure was chosen at random from a large number of such images, and the exponential falloff in cloud cover is not atypical and also occurs in a large number of Advanced Very High Resolution Radiometer and Geostationary Meteorological Satellite images studied by the author.
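One way such a fit could be carried out is by a least-squares line through the logarithm of the recorded cover, since an exponential falloff of the assumed form *f*(*L*) = *f*(0) exp(−*L*/*C*) is linear in ln *f*. The sketch below is illustrative only: the function name is hypothetical, and the fitting procedure actually used for Fig. 4 is not specified here.

```python
import math


def fit_mean_cloud_length(pixel_lengths, recorded_cover):
    """Fit ln(cover) = ln(f0) - L / C by ordinary least squares.

    Returns (C, f0): the estimated mean cloud length and the cover
    extrapolated to zero pixel length. All cover values must be positive.
    """
    ys = [math.log(f) for f in recorded_cover]
    n = len(pixel_lengths)
    mean_x = sum(pixel_lengths) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in pixel_lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pixel_lengths, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -1.0 / slope, math.exp(intercept)
```

Applied to synthetic values generated with a mean cloud length of 28 km, the fit returns that length; real recorded covers will scatter about the fitted line.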

#### 2) The sampling error

It is clear from Eqs. (8) and (9) that the probability of any pixel being registered as cloudy is a constant for an exponential field and equal to the long-term mean fractional cloud cover as derived by the radar, given by Eq. (9). This is independent of the starting point of the pixel, and so even overlapping pixels are effectively independent samples of the cloud field. Instantaneously, pixels close together are correlated, but over a number of simulations (or far from *x* = 0) this disappears because the probability that it is cloudy at any given point does not depend on its position. Thus, the probability that *k* pixels out of *n* are registered as cloudy can be found from the binomial distribution; and the mean and variance for *k* are given by Eqs. (10) and (11):

Hence, the fractional cloud cover as derived from samples of *n* pixels, given by the ratio of *k* to *n,* has a mean and a variance defined in Eqs. (13) and (14):

These equations show that the expected value of recorded cloud cover is again biased with the same value as derived before for an infinite number of samples and that its variance is inversely dependent on the number of samples. This is displayed in Fig. 5, which shows the sampling error and total mean square error for a pixel of length 10 km as they depend on the number of pixels sampled; this is the same as for the cloud field given in Fig. 1.
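These quantities can be sketched numerically, assuming the standard binomial results: *k* has mean *np* and variance *np*(1 − *p*), so that the ratio *k*/*n* has mean *p* and variance *p*(1 − *p*)/*n*, and the total mean square error about the true cover is the variance plus the squared bias. The function name and the dictionary keys are hypothetical.

```python
def sampling_statistics(p, n, true_cover):
    """Binomial sampling statistics for n effectively independent pixels.

    p is the probability that a pixel is registered as cloudy (the biased
    long-term cover) and true_cover is the true fractional cloud cover.
    """
    variance_cover = p * (1.0 - p) / n
    bias = p - true_cover
    return {
        "mean_k": n * p,
        "var_k": n * p * (1.0 - p),
        "mean_cover": p,
        "var_cover": variance_cover,
        "bias": bias,
        "total_mse": variance_cover + bias ** 2,
    }
```

As *n* grows, `var_cover` falls toward zero while `total_mse` levels off at the squared bias.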

### b. The deterministic cloud field

#### 1) The bias

The other extreme of an exponential queue is that of a deterministic (D/D/1) queue in which the clouds and gaps are fixed with lengths *C* and *G.* To find the bias, consider a region of length *m*(*C* + *G*), where *m* is a whole number, and choose a point *x* at random from the interval, that is, *x* ∼ *U*[0, *m*(*C* + *G*)). The probability that the whole of the pixel [*x, x* + *L*) is cloud covered is then given by Eq. (15):

Hence, in a deterministic cloud field one would expect the mean cloud cover derived from the radar to fall linearly with pixel size at a rate inversely dependent on the sum of the mean cloud and gap lengths, whereas in an exponential field the falloff is dependent only on mean cloud length. The mean cloud length can be found by fitting a straight line to the plot of bias against pixel size and is equal to the intercept divided by the slope as displayed in Fig. 6.
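The linear falloff and the recovery of the cloud length from a fitted line can be sketched as follows, assuming that Eq. (15) has the form (*C* − *L*)/(*C* + *G*) for *L* ≤ *C* and is zero otherwise. The line then has intercept *C*/(*C* + *G*) and slope −1/(*C* + *G*), so the magnitude of their ratio is *C*. The function names are hypothetical.

```python
def radar_cover_deterministic(cloud, gap, pixel_length):
    """Recorded cover when a pixel is cloudy only if completely full."""
    return max(cloud - pixel_length, 0.0) / (cloud + gap)


def cloud_length_from_line(l1, f1, l2, f2):
    """Cloud length from two points on the linear part of the falloff.

    The slope is negative, so the sign is reversed to return a positive
    length equal to the intercept divided by the magnitude of the slope.
    """
    slope = (f2 - f1) / (l2 - l1)
    intercept = f1 - slope * l1
    return -intercept / slope
```

For hypothetical cloud and gap lengths of 20 and 30 km, pixels of 5 and 10 km give recorded covers of 0.3 and 0.2, from which a cloud length of 20 km is recovered.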

#### 2) The sampling and total error in a deterministic field

If the field is set up as for the exponential field, by assuming that *p*_{1}(0) is equal to the fractional cloud cover and that, if it is cloudy above *x* = 0, the position of the beginning of the cloud is drawn from a uniform distribution, then the pixels are again effectively independent of each other. The sample and total mean square errors will then be given by equations identical to those for the exponential field but with the value of *p* (which is also equal to *f*_{bias}) given by Eq. (15) (see Fig. 7). If, however, this is not the case, then the measured cloud cover will depend on the sampling pattern, cloud and gap repeat distance, and the position of the start of the first cloud after *x* = 0.

## 3. The effect of thresholds

In the previous section it has been assumed that a pixel containing any clear area is recorded as clear, so that a pixel is recorded as cloudy only if it is completely cloud covered. The question now considered is what happens to the recorded cloud cover if at least a fixed fraction of the pixel’s length has to be covered by cloud in order for it to be registered as completely cloud filled. Clearly, the observed cloud cover could in this case take on any value between 0% and some maximum dependent on the threshold and so could be greater than the true cover, an event that is not possible if the whole pixel has to be cloudy.

### a. The deterministic field

Consider first the deterministic cloud field and a threshold fraction *f* that denotes the cutoff between a pixel being registered as cloudy or clear. There are a number of possibilities depending on the size of the pixel *L* relative to *C* and *G.* The first case to consider is where *L* is less than both *C* and *G.* As before, assume that point *x* is chosen at random from [0, *m*(*C* + *G*)), then the probability that the pixel [*x, x* + *L*) is covered by cloud for more than the fraction *f* of its length is given by Eq. (16):

This can be written as

The measured cloud cover is again linear in pixel length, but the slope now also depends on the fraction *f.* If *f* = ½, then the cloud cover estimate is unbiased, irrespective of pixel length (remembering that the pixel length is restricted to being less than both *C* and *G*) and is also independent of cloud and gap size. If *f* is greater than one-half, then the measured cloud cover is smaller than the true cover; if *f* equals unity, Eq. (15) is recovered; and if *f* is less than one-half, the recorded cloud cover is greater than the true cover. However, if the pixels are unrestricted in size, then there is a pixel size beyond which the recorded cloud cover jumps to 0% or 100%, depending on the value of *f* and the ratio of the mean cloud to gap length (see Fig. 8).
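The small-pixel result can be checked numerically. The sketch below assumes the closed form [*C* + (1 − 2*f*)*L*]/(*C* + *G*) for *L* less than both *C* and *G*, which reduces to the true cover at *f* = ½ and to (*C* − *L*)/(*C* + *G*) at *f* = 1, as described above. It is compared with a brute-force scan of pixel start positions over one cloud and gap period. The function names and the grid resolution are hypothetical.

```python
def covered_length(x, pixel_length, cloud, gap):
    """Cloud length inside [x, x + pixel_length), clouds at [kP, kP + cloud)."""
    period = cloud + gap

    def cumulative(t):
        # Total cloud length in [0, t) for t >= 0.
        whole, rest = divmod(t, period)
        return whole * cloud + min(rest, cloud)

    return cumulative(x + pixel_length) - cumulative(x)


def threshold_cover_numeric(cloud, gap, pixel_length, f, steps=100000):
    """Fraction of start positions whose cloud fraction exceeds f."""
    period = cloud + gap
    hits = 0
    for i in range(steps):
        x = (i + 0.5) * period / steps  # midpoints of a uniform grid
        if covered_length(x, pixel_length, cloud, gap) > f * pixel_length:
            hits += 1
    return hits / steps


def threshold_cover_analytic(cloud, gap, pixel_length, f):
    """Assumed closed form, valid only for pixel_length < min(cloud, gap)."""
    return (cloud + (1.0 - 2.0 * f) * pixel_length) / (cloud + gap)
```

With hypothetical lengths *C* = 20 km, *G* = 30 km, and *L* = 10 km, thresholds of 0.25, 0.5, and 0.75 give recorded covers of 0.5, 0.4, and 0.3 against a true cover of 0.4. The numeric scan uses a strict inequality, so it is not meaningful at exactly *f* = 1.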

### b. Exponential field

In the deterministic case there is a value of threshold *f* (*f* = ½) for which the recorded cloud cover is unbiased compared to the true cover. That is, if the satellite instrument (radar in this case) is sufficiently sensitive to record a return for all pixels that are greater than half full of cloud but not sensitive enough to detect a return if a pixel is less than (or equal to) half full, then the recorded cloud cover is unbiased. Thus, in such a case the sampling error and total error are equal to each other and so both fall to zero as the number of samples increases.

This raises the question as to whether there is such a value for an exponential field. The simple answer is no, as the probability distribution of the total amount of cloud within a given length, *L,* depends on *C, G,* and on *L.* This follows from the work (on queuing) by Conolly (1971), which is in turn derived from earlier work of Takacs (1957) and can be used to find the distribution of total cloud amount within a fixed length. Also given in Conolly (1971) is the asymptotic probability for the total cloud length to exceed a fixed fraction of the pixel length. As the size of the pixel tends to infinity this asymptotic limit is found to be 0, ½, or 1. Thus, depending on how *f* compares to the true cloud cover, the limiting observed cloud cover will be 0%, 50%, or 100% as the pixel size increases, though this convergence may be slow. Thus, asymptotically the only occasion when using a threshold of one-half would give an unbiased estimate of cloud cover for an exponential field is if the cloud cover is equal to one-half; see Fig. 9. For all other cloud covers, the recorded cloud cover will tend to 0% or 100%, as the pixel size increases if a threshold of one-half is used. This is also the case for any other threshold value except where the threshold equals the true cloud cover where it always tends to one-half, see Fig. 10. Thus, for any pixel size and cloud and gap size, one can find a threshold value that gives unbiased cloud cover estimates, but only for equal cloud and gap means is this threshold (in the limit) independent of the pixel size. If *f* is very close to zero then, from above, as the pixel size increases, the recorded cloud cover will tend to 100%, irrespective of the true cloud cover. Hence, the range of possible cloud cover increases with pixel length. Thus, it is to be expected that the recorded cloud cover is going to be more sensitive to the threshold used if the pixel size is increased; the effect of this can be seen in Fig. 10. 
Also from Fig. 10, for high sensitivities, the recorded cloud cover will tend to be larger for large pixels than smaller ones, whereas for low radar sensitivity the opposite is the case. Thus, at least qualitatively, this model reproduces the effects of sensitivity threshold and pixel size on cloud cover that were observed by Wielicki and Welch (1986) when studying Landsat data. The difference between their study and this, apart from theirs being derived from satellite images rather than a model, is that their clouds have a range of reflectivities, whereas all clouds are implicitly given the same reflectivity in this study.
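The limiting behavior described above can be explored with a simple Monte Carlo sketch, which is a substitute for the analytic distribution of Conolly (1971) and not a reproduction of it. Each pixel is simulated independently: it starts cloudy with probability equal to the true cover, and exponential cloud and gap lengths then alternate until the pixel is filled, which is valid because the exponential field has no memory. The function name, the seed, and the default number of pixels are hypothetical.

```python
import random


def threshold_cover_exponential(mean_cloud, mean_gap, pixel_length, f,
                                n_pixels=20000, seed=0):
    """Estimate the fraction of pixels whose cloud fraction exceeds f."""
    rng = random.Random(seed)
    true_cover = mean_cloud / (mean_cloud + mean_gap)
    hits = 0
    for _ in range(n_pixels):
        # Equilibrium start: cloudy with probability equal to the true cover.
        cloudy = rng.random() < true_cover
        remaining = pixel_length
        cloud_total = 0.0
        while remaining > 0.0:
            mean = mean_cloud if cloudy else mean_gap
            # Memoryless: the residual length is again exponential.
            step = min(rng.expovariate(1.0 / mean), remaining)
            if cloudy:
                cloud_total += step
            remaining -= step
            cloudy = not cloudy
        if cloud_total > f * pixel_length:
            hits += 1
    return hits / n_pixels
```

For equal mean cloud and gap lengths and *f* = ½ the estimate stays near one-half for any pixel length. For a true cover of 0.25 and a very long pixel, the estimate approaches 0% when *f* = ½ and 100% when *f* is well below 0.25.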

## 4. Conclusions

If a cloud radar will only detect clouds that are longer than a certain length, then small clouds will be missed and the true cloud cover will be underestimated. This manifests itself in returned cloud cover falling exponentially with pixel size in an exponential field and linearly in a deterministic field. Also for exponential fields, individual pixels (of constant size) behave as if they were independent samples, provided there is no point in the region sampled that has a greater probability of being cloudy than any other. If this latter statement is not the case, then obviously the probability of it being cloudy or clear at any point is not independent of position and will be as defined by Eqs. (6) and (7). The equations in this case for the bias and sampling error will not be as simple as presented in the above discussion. However, these equations are the limiting case for a system in equilibrium. In this equilibrium state, the mean square sampling error is bounded and depends inversely on the number of samples and thus can be made as small as required. However, because of the measurements being biased, the mean square error about the true mean is bounded below and so for even moderately large sample or pixel sizes the bias dominates the sampling error.

If only a certain fraction of a pixel has to be cloud covered before it is registered as cloudy, then there are regimes of pixel size and allowed gap length in which the cloud cover may be overestimated, underestimated, or unbiased in both the exponential and deterministic cases. In the deterministic case there is a threshold value, equal to one-half (of the pixel size), for which the estimated cloud cover is unbiased for small pixel sizes. There is no such value in the exponential case except in the limit for equal cloud and gap size means, and thus one would expect the cloud cover from an exponential field always to be biased, and can be either too high or too low. In the extreme of large pixel sizes, the estimated cloud cover in the exponential field tends to 0%, 50%, or 100%, depending on whether the value of the true cloud cover is smaller, equal to, or larger than the fractional threshold, and to 0% or 100% for the deterministic case.

## Acknowledgments

This work was supported under NERC Contract F60/G6/12 and ESA Contract 11326/95/NL/CN. Appreciation is extended to Barry Latter (of ESSC) for his contribution to the sampling study involving the ATSR image and to Professor R. Gurney, J. J. Settle (both of ESSC), and the anonymous reviewers for their valuable comments.

## REFERENCES

## Footnotes

*Corresponding author address:* Ivan Astin, NERC/ESSC, University of Reading, Whiteknights, Reading RG6 6AB, United Kingdom.

Email: iva@mail.nerc.essc.ac.uk