Feature Extraction from Whole-Sky Ground-Based Images for Cloud-Type Recognition

Josep Calbó, Department of Physics, and Institute of the Environment, University of Girona, Girona, Spain

and

Jeff Sabburg, Department of Biological and Physical Sciences, Faculty of Sciences, University of Southern Queensland, Toowoomba, Queensland, Australia

Abstract

Several features that can be extracted from digital images of the sky and that can be useful for cloud-type classification of such images are presented. Some features are statistical measurements of image texture, some are based on the Fourier transform of the image, and, finally, others are computed from the image where cloudy pixels are distinguished from clear-sky pixels. The use of the most suitable features in an automatic classification algorithm is also shown and discussed. Both the features and the classifier are developed over images taken by two different camera devices, namely, a total sky imager (TSI) and a whole sky camera (WSC), which are placed in two different areas of the world (Toowoomba, Australia; and Girona, Spain, respectively). The performance of the classifier is assessed by comparing its image classification with an a priori classification carried out by visual inspection of more than 200 images from each camera. The index of agreement is 76% when five different sky conditions are considered: clear, low cumuliform clouds, stratiform clouds (overcast), cirriform clouds, and mottled clouds (altocumulus, cirrocumulus). Discussion on the future directions of this research is also presented, regarding both the use of other features and the use of other classification techniques.

Corresponding author address: Dr. Josep Calbó, Dept. of Physics, University of Girona, Campus Montilivi, EPS-II, 17071 Girona, Spain. Email: josep.calbo@udg.es

1. Introduction

Clouds are a major meteorological phenomenon related to the hydrological cycle, and they affect the energy balance on both local and global scales through their interaction with solar and terrestrial radiation. It is broadly recognized, for example, by the Intergovernmental Panel on Climate Change (IPCC), that clouds (and cloud–aerosol interactions) are responsible for the largest uncertainties in climate models and climate predictions (Houghton et al. 2001; Carslaw et al. 2002). In addition, clouds affect our everyday lives, for example, by modifying the amount of ultraviolet (UV) radiation that reaches the earth’s surface (Calbó et al. 2005). Most cloud-related studies require some sort of cloud observations, such as the amount and type of clouds present. Additionally, research involving natural illumination generally requires knowledge about clouds and their position with respect to the sun’s disk (e.g., Chen et al. 1994). These macroscopic observations have historically been performed by human observers who recorded cloud cover, cloud type, and, in some cases, the sun-disk condition (WMO 1975, 1987; Gedzelman 1989). However, the high costs associated with human observers have pushed observation programs toward automatic devices that detect and quantify cloud amount and type.

Satellite information is, of course, available, but satellite retrievals have known weaknesses in quantifying small and/or low-altitude cloud features, owing to their limited spatial resolution and to unknown surface influences on the measured radiances. There are also issues arising from the differences in viewing geometry between satellite- and ground-based sensors (e.g., Goodman and Henderson-Sellers 1988). One sky characteristic that is difficult to determine from a downward-pointing satellite sensor is solar obstruction, which is of utmost importance in radiation/cloud studies (Long et al. 2006). Numerous textbooks discuss, in one form or another, aspects of meteorological satellites that are used to qualitatively or quantitatively describe sky characteristics (e.g., Mason and Hughes 1998). Parisi et al. (2004) devote one subsection of their textbook to satellite-based cloud detection, with an emphasis on solar radiation applications. Although new satellite-based cloud-detection systems are in the “pipeline” (e.g., Triana; information available online at http://toms.gsfc.nasa.gov/future/triana.html), ground-based observations are still necessary to complement, and to provide ground truth for, spaceborne cloud-detection sensors.

One option for obtaining continuous information on sky conditions at a local scale is the use of sky-imaging devices. A number of papers (Long et al. 2006; Souza-Echer et al. 2006; Kassianov et al. 2005; Lu et al. 2004; Pfister et al. 2003; Shields et al. 2003; Martins et al. 2003; Crawford et al. 2003; Sabburg and Wong 1999) demonstrate the increasing number of ground-based sky imagers being developed and used in several countries. This development is partly due to the dramatic improvements in technology in recent years, with respect to both the hardware [e.g., charge-coupled devices (CCDs)] and digital image processing (DIP) techniques. Among the best-known instruments of this kind is the family of whole sky imagers (WSIs), developed by the Scripps Institution of Oceanography at the University of California, San Diego. They are designed to measure radiances at distinct wavelength bands across the hemisphere (Shields et al. 2003). The WSI, besides offering many other scientific capabilities, can estimate fractional sky cover (Tooman 2003; Johnson et al. 1989). Unfortunately, because of the high-quality components and sophisticated engineering involved, the cost of this instrument puts it beyond the means of many research groups whose main interest lies in inferring daytime cloud macroscopic properties. Commercially, very few nonradiance sky camera systems are available. One of the sky imagers best known to the atmospheric science community is the total sky imager (TSI), manufactured by Yankee Environmental Systems, Inc. (YES, Turners Falls, Massachusetts).

A recent paper (Long et al. 2006) reports DIP techniques that can be used to obtain fractional sky cover and other sky characteristics from any ground-based, all-sky camera image. Specifically, two sky cameras are described in detail in that paper, the TSI-440/880 and the whole sky camera (WSC), the latter developed by the University of Girona (Spain) (Pagès et al. 2002), and particular attention is paid to issues such as geometrical correction and dynamic range. One conclusion from that paper is that state-of-the-art methodologies [such as the ones used for the TSI images, described in detail, e.g., in Pfister et al. (2003)] can obtain fractional sky cover with little uncertainty (i.e., as little as the uncertainty of classical human observations). The fractional sky cover estimation described is based on thresholding the image according to the ratio of red and blue intensities of the red–green–blue (RGB) image. This is in contrast to Martins et al. (2003) and Souza-Echer et al. (2006), who suggest an alternative way to find the fractional sky cover based on the saturation component of the hue–saturation–intensity (HSI) transformation of the original RGB image. Another conclusion from Long et al. (2006) relates to the difficulty of obtaining cloud characteristics that can be used to infer classical cloud types. Consequently, the paper suggests that further research is required to describe cloud types from sky images.

Few previous works deal with the processing of sky images to obtain characteristics that might be used for cloud-type classification. One exception is the paper by Singh and Glennen (2005), in which hundreds of features from grayscale images are investigated to classify sky conditions into five different classes for air traffic applications, with results that were judged as only modest by the authors themselves. On the other hand, Buch et al. (1995) explain a binary-tree scheme to classify WSI images into five types: clear, stratus, cumulus, cirrus, and altocumulus. They suggested some texture-related features and obtained a misclassification rate of 39% for their test pixels when compared with visual classification. Other papers (Allmen and Kegelmeyer 1996; Seiz et al. 2002; Kassianov et al. 2005) focus on cloud-base height (CBH) estimation, which is of course related to cloud type. However, all the latter papers make use of two simultaneous images taken at two sites, in close proximity, for deriving CBH. Finally, there are other works that show additional DIP techniques to obtain other characteristics of the sky images, such as solar obstruction by clouds or cloud brokenness (Sabburg and Long 2004).

In this paper, we present several features that can be extracted from digital images of the sky and that can be useful for cloud-type classification of such images. Some features are statistical measurements of image texture, some are based on the Fourier transform of the image, and, finally, others are computed from the thresholded image (i.e., the image in which cloudy pixels are distinguished from clear-sky pixels). These features, as well as the sets of images used for cloud-type classification from two different cameras, are presented in the next section (section 2). In section 3, we present a possible automatic classification method based on the most suitable features, and we apply this method to two series of images. Finally, in section 4 we summarize the conclusions of this research and suggest possible future investigations on cloud-type identification from ground-based whole sky images.

2. Data and methods

a. Sky cameras and associated images

The development of our methodology for feature extraction and cloud-type classification has been performed on a number of images taken by two different sky cameras, a TSI and the WSC. The WSC consists basically of a digital color video camera mounted with a fish-eye lens [180° field of view (FOV)] pointing to the zenith. The camera is protected against environmental factors, and a shadow band is used to avoid direct incidence of the sunbeam on the lens. This camera is installed on the roof of a university building in Girona (41.97°N, 2.82°E; 100-m altitude).

The TSI consists basically of a digital color video camera mounted to look down on a curved mirror to provide a horizon-to-horizon view of the sky (effectively a 160° FOV). The mirror rotates to keep a dull black strip on the mirror aligned with the solar azimuth angle to block the direct sun from the camera. The TSI used in this study is located atop a four-story building, with no surrounding hills or trees affecting the FOV, at the campus of the University of Southern Queensland (USQ), Toowoomba, Australia (27.5°S, 151.9°E; 693-m altitude). More details on these two cameras can be found elsewhere (Long et al. 2006).

The WSC takes one image at constant time intervals (currently 1 min). Images have a resolution of 768 × 576 pixels, and those used in this work were originally stored as BMP images. However, we have already shown (Long et al. 2006) that JPEG compressed images might be better suited for DIP; therefore, here we have converted the BMP images into JPEG by setting the compression ratio to 80%. The TSI in question takes one image at constant time intervals (recently changed from 5 to 1 min). Images have a resolution of 352 × 288 pixels and are stored as JPEG images. At the present stage of our research, we would like to focus on feature extraction from the relevant parts of the image for cloud-type identification. In other words, we would like to avoid problems related to the presence of the border of the image (horizon) or of the shadow band. Therefore, we used sections of the whole image that, for convenience, were defined as square regions of 256 × 256 and 78 × 78 pixels, respectively, for the WSC and the TSI; this is approximately 25% and 9% of the area of the sky images. To use the same computer code for both sets of images, the TSI sections were resized to 256 × 256 pixels using the bilinear interpolation option of the resizing utility in the MATLAB software package. Some samples of entire WSC images, and the corresponding square sections used in the analysis presented in this paper, are shown in Fig. 1.
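For readers wishing to reproduce this preparation step, the following is a minimal sketch in Python (NumPy and Pillow standing in for the MATLAB resizing utility the authors used); the file names and crop origins are hypothetical, while the section sizes and the bilinear option follow the text.

    import numpy as np
    from PIL import Image

    def prepare_section(path, x0, y0, side, target=256):
        # Crop a side x side square section at (x0, y0) and resize it
        # to target x target pixels using bilinear interpolation.
        img = Image.open(path)                              # RGB sky image
        section = img.crop((x0, y0, x0 + side, y0 + side))
        section = section.resize((target, target), Image.BILINEAR)
        return np.asarray(section, dtype=float)             # (target, target, 3)

    # WSC: 256 x 256 section (no resizing needed); TSI: 78 x 78 section,
    # resized to 256 x 256. Paths and crop origins are placeholders.
    wsc_section = prepare_section("wsc_image.jpg", x0=256, y0=40, side=256)
    tsi_section = prepare_section("tsi_image.jpg", x0=180, y0=160, side=78)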

Two different sets of images were used for the two stages of this work. First, a limited number of images were selected by visual screening, on the basis that they were good samples of the different cloud types that we would like to identify (see more details below). These sets of images, called WSC1 and TSI1, correspond to different times of day and days of the year [i.e., different solar zenith angle (SZA) conditions], and their square sections were taken from different sectors of the whole image. These images were used for developing the methodology. Second, all noon images available during a year (November 2001–October 2002 for the WSC and July 2003–June 2004 for the TSI) were used to assess the methodology; these sets are called WSC2 and TSI2. The latter images also cover a broad range of SZA (approximately 20°–60° for WSC2 and 4°–50° for TSI2), and their square sections were always taken from the northern sector of the image for the WSC and the southwestern sector for the TSI. Since the WSC2 and TSI2 sets are built from 1 yr of images from two climatically different sites, they guarantee that a broad range of sky conditions is represented.

b. Sky conditions and cloud types to be identified

The sky can present an infinite number of different “aspects,” depending on a number of factors. Even the definition of a cloud is sometimes problematic, especially when different observational devices (e.g., ceilometers, radars, visual observation, satellite images) are used to detect clouds. It is also well known that two trained observers may sometimes disagree on the amount and/or type of clouds present in the sky, particularly if they are assessing a situation away from their usual observation site. Therefore, it is reasonable to start our research by defining some sky conditions or cloud types that are quite easily distinguished by visual observation and try to develop a method for automatic recognition of these simpler cases.

After inspection of both TSI and WSC images, and also based on previous works (Buch et al. 1995) and on our own experience, we defined eight different sky conditions to be used as a basis for our classification methodology (see Table 1). We purposely avoided having classes with a complex mixture of cloud types. Note also that some cloud types can be found in more than one of our classes. Therefore, it is obvious that we are not trying to classify the images into the classical 10 cloud genera but into different sky conditions that are related to the visual aspect of the clouds present. These different conditions might be useful for radiative transfer studies or for comparison with observations from satellites.

c. Image features used for cloud-type recognition

1) Statistical texture features

A frequently used approach for texture analysis of images is based on statistical properties of the histogram of the image. This approach applies to grayscale images, that is, images whose pixels are described by a single value. Since our images originally have three components (red, green, blue), we need to transform them in some way into grayscale images. Two transformed images have been used: the image of the red-to-blue component ratio (R/B) and the image of the intensity values, defined as one-third of the sum of the three color components (see Fig. 1).
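As an illustration, a minimal sketch of these two transforms in Python/NumPy, assuming an H × W × 3 RGB array such as the one produced above; the small eps guard against division by zero is our addition.

    import numpy as np

    def rb_ratio(rgb, eps=1e-6):
        # Red-to-blue ratio image; eps (our addition) guards against
        # division by zero in pixels whose blue component is 0.
        return rgb[..., 0] / (rgb[..., 2] + eps)

    def intensity(rgb):
        # Intensity image: one-third of the sum of the three components.
        return rgb.mean(axis=-1)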

The statistical features that have been checked are the following (González et al. 2004); a sketch computing all six features follows the list:

  • Mean (ME):
    ME = \sum_{i=0}^{L-1} z_i \, p(z_i),   (1)
    where z is the variable that indicates the values in the image (R/B, or intensity in our case), p(z_i) is the frequency distribution of these values in the image, and L is the number of possible levels of z.
  • Standard deviation (SD):
    SD = \left[ \sum_{i=0}^{L-1} (z_i - ME)^2 \, p(z_i) \right]^{1/2}.   (2)
    The standard deviation is a measure of contrast in the image.
  • Smoothness (SM):
    SM = 1 - \frac{1}{1 + \sigma^2},   (3)
    where the variance σ² is defined here as σ² = SD²/(L − 1)². Values of SM fall in the range [0, 1]: SM is 0 for an image of constant values and approaches 1 for an image with large variability.
  • Third moment (TM):
    TM = \sum_{i=0}^{L-1} (z_i - ME)^3 \, p(z_i),   (4)
    which measures the skewness of the histogram.
  • Uniformity (UF):
    UF = \sum_{i=0}^{L-1} p^2(z_i).   (5)
    This feature is maximum when only one gray level is present in the whole image and minimum when all levels are present in equal numbers of pixels.
  • Entropy (EY):
    EY = -\sum_{i=0}^{L-1} p(z_i) \log_2 p(z_i),   (6)
    which is a measure of the randomness of the level values in the image.
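The sketch below (Python/NumPy) computes the six statistical features of Eqs. (1)–(6) from the normalized histogram of a grayscale array such as the R/B or intensity images above; quantizing to L = 256 levels is our assumption for illustration.

    import numpy as np

    def statistical_features(gray, L=256):
        # Normalized histogram p(z_i) over L levels; bin centers serve
        # as the level values z_i.
        hist, edges = np.histogram(gray, bins=L)
        p = hist / hist.sum()
        z = 0.5 * (edges[:-1] + edges[1:])

        me = np.sum(z * p)                           # mean (ME), Eq. (1)
        sd = np.sqrt(np.sum((z - me) ** 2 * p))      # std. deviation (SD), Eq. (2)
        sm = 1.0 - 1.0 / (1.0 + sd**2 / (L - 1)**2)  # smoothness (SM), Eq. (3)
        tm = np.sum((z - me) ** 3 * p)               # third moment (TM), Eq. (4)
        uf = np.sum(p ** 2)                          # uniformity (UF), Eq. (5)
        nz = p[p > 0]                                # skip empty bins in the log
        ey = -np.sum(nz * np.log2(nz))               # entropy (EY), Eq. (6)
        return {"ME": me, "SD": sd, "SM": sm, "TM": tm, "UF": uf, "EY": ey}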

2) Pattern features based on the Fourier spectrum

All the above features may be useful for image characterization, but note that none of them deals with any kind of pattern, or shape, present in the images. To take cloud patterns into account to some extent, we have used the Fourier transform [through the fast Fourier transform (FFT) algorithm] of the images. Indeed, the analysis of images in the frequency domain (i.e., of the result of the Fourier transform) should be a useful tool for finding differences among them, as explained, for example, by González et al. (2004). The process to obtain the spectral power images (such as the ones shown in Fig. 1) has several steps, as follows. First, to avoid a large spectral power at the 0 wavenumber, we subtracted the average of the values in the image from all pixels. Second, a multiplying filter, having a value of 1 over most of the domain and a cosine shape at the borders, was applied to reduce the variance introduced by the edge discontinuity. Third, the 2D FFT routine, as provided by the MATLAB software package, was applied. The result is the set of complex amplitudes of the harmonics that correspond to each wavenumber (in the two directions). The modulus of these complex amplitudes is known as the spectral energy function. Normalizing this function by the size of the image, we obtain the spectral power function, which is the basis for further analyses. Note that since the initial values (i.e., both the R/B and the intensity images) are real, the spectral power is a symmetric function with respect to the origin.
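A minimal sketch of these three steps in Python/NumPy follows; the authors used MATLAB's 2D FFT routine, and the Tukey window here is an assumed realization of the "1 over most of the domain, cosine at the borders" filter described in the text.

    import numpy as np
    from scipy.signal import windows

    def spectral_power(gray, taper=0.1):
        # 1) Subtract the mean to remove the zero-wavenumber peak.
        n = gray.shape[0]                       # square section, e.g., 256
        demeaned = gray - gray.mean()
        # 2) Cosine-tapered border filter (Tukey window; exact shape assumed).
        w = np.outer(windows.tukey(n, taper), windows.tukey(n, taper))
        # 3) 2D FFT; modulus normalized by image size gives the spectral power.
        amp = np.fft.fftshift(np.fft.fft2(demeaned * w))
        return np.abs(amp) / (n * n)            # S(kx, ky), origin at center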

Spectral power functions corresponding to different sky images naturally differ from one another (see Fig. 1). However, we need to extract some simple characteristics of the spectral function for it to be useful for cloud-type recognition. After testing several options based on previous works (Garand 1988; Salvador et al. 1999; Sabburg and Wong 1999), we decided to use two features, which we call correlation with clear (CC) and spectral intensity (SI), respectively.

The feature CC quantifies the similarity between the spectral power function of any given image and the spectral power function of a clear-sky image taken as reference. Specifically, the value of CC is the linear correlation coefficient between the logarithms of the two spectral power functions. In practice, CC always takes a value in the range from 0 to 1, and the higher the value of this feature, the more uniform the aspect of the sky.
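A sketch of the CC computation (Python/NumPy), assuming the spectral_power function above and a stored clear-sky reference spectrum; the eps guard is our addition.

    import numpy as np

    def correlation_with_clear(S, S_clear, eps=1e-12):
        # Linear correlation coefficient between the logarithms of the
        # two spectral power functions; eps avoids log(0) in empty bins.
        a = np.log(S.ravel() + eps)
        b = np.log(S_clear.ravel() + eps)
        return np.corrcoef(a, b)[0, 1]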

The SI takes into account the distribution of spectral power across a range of wavenumbers. That is, depending on the patterns of the clouds present in the images, there will be more or less spectral power at particular wavenumbers. To quantify this effect in a single value, we proceed as follows. First, we define the accumulated spectral power E*(k1, k2) between two wavenumbers k1 and k2 (k1 < k2) as
E^{*}(k_1, k_2) = \sum_{k_1 \le (k_x^2 + k_y^2)^{1/2} \le k_2} S(k_x, k_y),   (7)
where S(kx, ky) is the spectral power function, and the dependence on the wavenumbers in both directions has been made explicit. Second, we define and compute a spectral power ratio R as
R(k) = \frac{E^{*}(k_{\min}, k)}{E^{*}(k_{\min}, k_{\max})},   (8)
where kmin = 1/256 (the reciprocal of the image size) and kmax = 1/2. Note that the spectral power function also has a value at kx = ky = 0, but this corresponds to the “direct current” component of the image, that is, to its mean value, which was set to 0 in the treatment prior to the application of the FFT. If we then plot the logarithm of R versus the wavelength λ (i.e., the reciprocal of the wavenumber), an approximately linear relation emerges. The SI feature is the absolute value of the slope of the corresponding regression line, forced to cross the point [λ = 2, log(R) = 0] and restricted to λ ≤ 64 to avoid giving too much weight to longer wavelengths. Figure 2 shows plots of the spectral power ratio R versus the wavelength λ for two of the images shown in Fig. 1, illustrating the linear relation and the differences in slope (i.e., in the SI feature) between images.
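A sketch of the SI computation (Python/NumPy), assuming the fftshifted spectral power array produced by the earlier sketch; the radial accumulation region and the exact form of R are our reading of Eqs. (7)–(8) and should be treated as assumptions.

    import numpy as np

    def spectral_intensity(S, n=256):
        # Radial wavenumber |k| on the same (fftshifted) grid as S.
        f = np.fft.fftshift(np.fft.fftfreq(n))
        kx, ky = np.meshgrid(f, f)
        kr = np.hypot(kx, ky)

        kmin, kmax = 1.0 / n, 0.5
        total = S[(kr >= kmin) & (kr <= kmax)].sum()    # E*(kmin, kmax)

        # log R at wavelengths lambda = 2..64 (k = 1/lambda), per the text.
        lam = np.arange(2, 65, dtype=float)
        logR = np.array([np.log10(S[(kr >= kmin) & (kr <= 1.0 / l)].sum() / total)
                         for l in lam])

        # Least-squares slope of a line forced through (lambda = 2, log R = 0).
        slope = np.sum((lam - 2.0) * logR) / np.sum((lam - 2.0) ** 2)
        return abs(slope)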

3) Features based on the thresholded image

All the features described above are computed over all image pixels, without explicitly distinguishing between cloudy pixels and clear-sky pixels. Therefore, it seems plausible that adding some other features that make this distinction could help in sky-condition recognition. The distinction between cloudy and clear pixels is made here by thresholding the R/B image in a way similar to that described in our previous work (Long et al. 2006). The value of the threshold, however, is computed here by the “graythresh” routine included in MATLAB (which “guesses” the threshold based on the histogram of the image). A restriction is added to keep the threshold between specified values (0.60–0.64 for WSC images and 0.56–0.64 for TSI images). Unlike the procedure applied by the standard TSI software to its images, the value of the threshold does not depend on the relative position of the pixel in the image. This should not cause further problems, since we are analyzing sections of the whole image, and these sections are in general away from the most “troublesome” areas (i.e., around the sun or close to the horizon).

Once the threshold is applied to the image, the determination of the fractional sky cover (FSC) for the selected image (i.e., a region of the whole sky image) is straightforward: the number of cloudy pixels is divided by the total number of pixels. No geometric correction is applied in this work, since its goal is not an accurate estimation of FSC [more on the effect of geometric correction for optical distortion can be found in Long et al. (2006)]. Then, cloud brokenness (CB) is computed by dividing the number of pixels on the perimeter of the cloudy areas by the number of cloudy pixels. Since the borders of the image are considered cloud borders for a totally overcast image, CB is always greater than 4N/N² = 4/N (where N is the size of the image, 256 in our case). Despite this, CB is minimum for overcast skies and maximum for skies with “patchy” clouds. Finally, we also compute the mean R/B value for the pixels that are set as cloudy. This feature (TH) should be related to cloud thickness, since, in principle, the thinner the cloud, the bluer its color. That is, we expect to find lower values of TH (although always greater than the threshold used) for thin clouds than for thick clouds.
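A sketch of the three thresholded-image features in Python, with scikit-image's Otsu threshold standing in for MATLAB's graythresh; the clamping bounds shown are the WSC values quoted above.

    import numpy as np
    from scipy import ndimage
    from skimage.filters import threshold_otsu

    def thresholded_features(rb, lo=0.60, hi=0.64):
        # Threshold "guessed" from the histogram (Otsu), then clamped to
        # the per-camera bounds; pixels above the threshold are cloudy.
        thr = float(np.clip(threshold_otsu(rb), lo, hi))
        cloudy = rb > thr

        fsc = cloudy.mean()                     # fractional sky cover (FSC)

        # Cloud brokenness (CB): perimeter pixels over cloudy pixels.
        # Eroding with the default border_value=0 makes image borders
        # count as cloud borders, as described in the text.
        interior = ndimage.binary_erosion(cloudy)
        cb = (cloudy & ~interior).sum() / max(cloudy.sum(), 1)

        # TH: mean R/B of cloudy pixels, a proxy for cloud thickness.
        th = rb[cloudy].mean() if cloudy.any() else np.nan
        return fsc, cb, th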

3. Results and discussion

a. A classifier based on the features

We have computed the values of all the above features for each image in the training sets (WSC1 and TSI1). Previously, all these images had been classified (by visual inspection by the two coauthors of this paper) into one of the eight sky condition classes of Table 1. A summary of the obtained values is shown in Table 2. Specifically, in this table a range of values is given for each feature and sky condition. The range is obtained as the average of the values for all images in a class (usually four to five images) ±1.5 times the standard deviation of these values. In some cases, the range has been narrowed to take into account a physical constraint (e.g., SM, CC, and FSC must lie between 0 and 1). Also, we found a systematic difference for some features when computed for TSI images versus WSC images. In these cases (SD, SM, SI) we adjusted the TSI values to make their average equal to the average of the corresponding WSC values. The origin of these differences, affecting only some features, is most likely related to differences between the sky cameras, more specifically to the CCD sensitivities. Note that the ranges of values of the various features are in general in good agreement with what was expected for the different classes, at least in relative terms.

Two main facts become apparent after analyzing the results in Table 2. First, some features are strongly correlated, and therefore they do not provide additional information for cloud-type (or sky condition) recognition. In particular, most textural features computed from the R/B image are highly correlated with their corresponding feature computed from the intensity image. Second, the ranges of values of some features are so wide that these ranges, corresponding to different sky conditions, overlap. As a consequence, these features are hardly suitable for distinguishing between sky conditions. An example of this latter problem is feature TM, where the range for class D (−0.092, 0.052) covers all the other ranges of values for other classes.

From the above results and considerations, we selected a number of features that seem the most adequate to classify an image. These selected features are ME, SD, SM, EY (from the R/B ratio), EY (from the intensity image), CC, SI, FSC, and TH (from FFT and thresholding). Then, the next step toward cloud-type recognition is to develop an algorithm (the classifier) to use the values of these nine features from a new image in order to classify the image into one of the eight sky conditions.

The classifier that we developed is based on the supervised parallelepiped technique, which has been used elsewhere for similar applications (Duchon and O’Malley 1998; Souza-Echer et al. 2006). One example of this technique is illustrated in Fig. 3, in which only two of the features are represented. The range of possible values for each sky condition, and for the two features SI and CC, is shown by the rectangles (i.e., two-dimensional parallelepipeds). In Fig. 3, it becomes apparent that some sky conditions can be readily distinguished by the use of only these two features. For example, an image belonging to class E will never be confused with an image belonging to class A, D, or K. However, depending on the values of these two features, sometimes we would not be able to distinguish between classes E, G, H, clear, and C. Fortunately, we still have other features to be used to distinguish between these classes.

More formally, the classifier can be described as follows. An image Xi is characterized by a series of values fi,j, where j indicates each of the nine features explained above. Then a test is applied to each of these values:
T_{i,j}^{k} = \begin{cases} \text{TRUE}, & \text{if } \min_{j}^{k} \le f_{i,j} \le \max_{j}^{k} \\ \text{FALSE}, & \text{otherwise}, \end{cases}   (9)
where min_j^k and max_j^k are the minimum and maximum values that define the width of the parallelepiped for each feature j and sky condition class k; these are the values given in Table 2. Note that for two features (CC and FSC), the ranges used in the test were later made wider than those resulting from the average ±1.5 times the standard deviation of the feature values (from the training images), to allow more images to be classified, as required for statistical purposes. Subsequently, the classifier initially assigns the image X_i to a specific class K if
S_{i}^{K} \ge S_{i}^{k} \;\; \text{for all } k, \quad \text{where} \quad S_{i}^{k} = \sum_{j=1}^{N_j} T_{i,j}^{k}.   (10)
That is, counting TRUE tests as 1 and FALSE tests as 0, the classifier assigns an image to the class where the features computed for the image fit best. It is not necessary that all tests be TRUE, only that the number of TRUE tests be at least as large as for any other class. Note that the number of features used in this step of the classifier is N_j = 8, since feature TH is used later to distinguish among some particular classes.
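A minimal sketch of this basic step in Python [Eqs. (9) and (10) only; the additional FSC/CC and TH conditions described below are omitted, and the ranges shown are placeholders, not the values of Table 2]:

    FEATURES = ["ME", "SD", "SM", "EY_rb", "EY_int", "CC", "SI", "FSC"]  # Nj = 8

    # RANGES[class][feature] = (min, max). Placeholder numbers only; the
    # real parallelepiped bounds are those of Table 2.
    RANGES = {
        "clear": {"CC": (0.90, 1.00), "FSC": (0.00, 0.05)},
        "A":     {"CC": (0.40, 0.80), "FSC": (0.05, 0.50)},
    }

    def classify(f):
        # f: dict mapping feature name j to the image's value f_{i,j}.
        # Eq. (9): test each feature against [min_j^k, max_j^k];
        # Eq. (10): assign the class with the most TRUE tests.
        scores = {
            k: sum(1 for j in FEATURES
                   if j in box and box[j][0] <= f[j] <= box[j][1])
            for k, box in RANGES.items()
        }
        best = max(scores.values())
        # Ties yield several candidate classes (as discussed in the text).
        return [k for k, s in scores.items() if s == best], scores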

Steps described by expressions (9) and (10) are the basis of the classifier. However, some further considerations are taken into account to improve its ability to correctly classify sky images. For example, for most classes an additional condition is set to ensure that if an image is assigned to that class, at least one of the tests for the two features, FSC or CC, gives a TRUE result. This additional constraint reflects the fact that FSC and CC are probably the “best” two features to distinguish the sky conditions described in Table 1. Finally, for an image to belong to classes G or H, its TH feature must be less than the maximum value of the corresponding ranges (see Table 2).

After this process, any image should be classified into one of the eight sky conditions [i.e., into the class whose corresponding S_i^{*k} is maximum, where the asterisk indicates that the result of Eq. (10) has been modified by the additional conditions explained in the previous paragraph]. Note that the value of S_i^{*k} is computed for all classes; therefore, it is quite easy to assign a probability of belonging to each class (instead of a unique assignment) based on the values of S_i^{*k}. On the other hand, some images may not be classified at all; that is, S_i^{*k} is null for all k (i.e., for all classes). This may happen because of the already-mentioned additional conditions. Finally, an image may be classified into two or more classes if the values of S_i^{*k} for several classes happen to be equal.

b. Assessment of the classifier

To assess the performance of the methodology described above (the classifier), we used the two additional sets of images already described in section 2a (TSI2 and WSC2). First, the two coauthors independently inspected all these images (recall that these are regions of the whole sky images) and labeled them according to the eight possible sky conditions (manual classification). In some cases the sky conditions assigned by the two authors disagreed; instead of trying to reach an agreement, we discarded these cases from further analysis. Some other images were also discarded because of the presence of glare or other undesirable effects. Therefore, the total number of images finally used in the performance analysis was 395 (196 from TSI2 and 199 from WSC2). Then, the automatic classifier was applied to these sets of images. The result of the classification (compared to the manual classification) is shown through the contingency matrices (sometimes called confusion matrices) detailed in Table 3.

Note that an alternative assessment methodology could have been used, based on an a posteriori visual inspection of images, instead of our a priori methodology. In that case, all images would have been manually (visually) inspected after they had been classified, and labeled as “correct” or “incorrect.” We anticipate that this method would have led to a better accuracy index, since most images that were excluded from the analyses (because of the disagreement between the two coauthors) would have been counted as “correct” images. The reason is that, for a complex image that is difficult to assign to a particular sky condition, a human observer would tend to accept the automatic (objective) classification as correct, as long as it is one of the possibilities considered by the observer.

The agreement (accuracy) of the classifier, computed as the number of correctly classified images over the total number of images, is 69% and 54% for the TSI2 and WSC2 sets, respectively. The joint agreement, that is, the weighted average of these two values, is almost 62%. Since most “confusions” occur between classes A, C, and D (in both sets) and between classes G and H (in set WSC2; no images belonging to these classes are available in set TSI2), Table 4 shows a new contingency matrix in which these classes are grouped together and both sets of images are included. In this case, the accuracy index is 76%. Note that if we do not take into account the “Clear” class (which is the most populated), the accuracy of classification into the other four sky conditions is 71%. The union of classes A, C, and D is justified because these three classes actually correspond to the presence of cumuliform clouds (basically, Cu and Sc); similarly, the union of the two classes that correspond to cirriform clouds (G and H) is also logical. The accuracy index obtained is quite satisfactory given the simplicity of the classifier and the assessment methodology used.

From a detailed analysis of Tables 3 and 4, it becomes apparent that some confusions are more common than others. First, a number of images visually identified as “Clear” are classified in classes G and H. The origin of this problem is a well-known issue (see Long et al. 2006), namely, the “whitening effect” of the sky in some areas of the sky dome, particularly around the sun disk and when the atmosphere presents a high aerosol load. In these cases, a fraction of pixels are set as cloudy, and, of course, these false clouds resemble very thin clouds, that is, cirriform clouds that are typical for classes G and H. Fortunately, there exists a possible solution for this whitening effect, which needs a sequential (1 min) series of images, since it is based on the variability of cloud fraction in the complex areas (Long 2005).

Second, several images that should be classified as A (according to the visual inspection) are automatically classified as G or H. This usually happens when the cloud fraction in the image is at the low end (i.e., less than 0.30) and the image produces a CC feature outside the range assigned to class A. This confusion could probably be resolved by using other features, such as the mean intensity of the cloudy pixels. Third, some images that correspond to overcast skies (class E) are classified as class K. After checking these misclassified images, we found that all of them were WSC images that showed raindrops. We concluded that the patchy pattern in the image, due to drops on the camera dome, produced values of the CC, SD, and EY features similar to those expected for a real mottled field of clouds corresponding to class K. The fourth most common confusion is for images that are visually identified as containing cirrus clouds (G and H) but that the classifier assigns to the “Clear” or C classes. The explanation here is almost the reverse of that for the first two confusions. On the one hand, if cirrus are extremely thin, pixels may have an R/B ratio below the threshold used, so they are not detected as clouds. On the other hand, if they are thick and white enough, some features may take the typical values for class C.

Finally, there is the issue of images a priori classified as K. Only 1 out of 15 images is correctly classified; two other images are classified with the same probability as belonging to class K or to another class (C). Some K images are classified in class G, probably because of the presence of what a human observer identifies as cirrocumulus clouds, which the classifier is not able to distinguish from other cirriform clouds. Other K images are classified as C: in this case, the problem is the difficulty of distinguishing (sometimes even for a human observer) between altocumulus and cumulus or stratocumulus. The problem with class K stems from a selection of training images that is not representative enough of the sky conditions we would like to include in this class. Indeed, all the training images were examples of nicely mottled skies with cirrocumulus, but this type of sky is rarely found, either in WSC or in TSI images.

4. Conclusions

We have shown in this paper that the recognition of different cloud types by processing digital images taken by sky cameras is possible. Our results open an interesting field of research that must be further explored, both in defining features that can be related to cloud-type recognition and in developing more complex algorithms for the automatic classification of images according to their cloud type.

Three different kinds of features have been explored in this work: statistical features, features based on the Fourier transform of the image, and features that require the distinction of cloudy pixels from sky pixels. Although the fast Fourier transform was initially considered a very promising approach to cloud-pattern recognition, difficulties arise when we try to summarize the information contained in the FFT in a limited number of values. Other features could possibly be suggested based on the spectral power function of an image, besides the two that we have used here (the correlation with the corresponding function of a clear-sky image, and a measure of the shape of this function, the so-called spectral intensity). Regarding features defined after a threshold is applied to distinguish between cloudy and clear pixels, we note that better results might have been obtained had we used a threshold that varies across the image depending on pixel position. It has previously been shown that the suitable threshold in the sun-disk area, or in the horizon area, is different from the threshold used in other areas of the sky.

As far as the automatic classification of images is concerned, we have presented a simple methodology, assessed against an a priori visual analysis, giving an interesting index of agreement of 62% when eight sky conditions are considered. This index increases to 76% when some of these conditions are considered jointly, producing five different sky conditions; each of these latter conditions is easily related to typical cloud types [clear, low cumuliform, stratiform (overcast), cirriform, and mottled (altocumulus, cirrocumulus)].

To improve the results reported here, there are at least two possibilities (which can be applied simultaneously). One is the use of a “running” window through the whole image, which would produce a number of guesses (one for each window analyzed) for a single image; one could then either take the most common guess as correct or identify different cloud types in the same image. The other possibility is the use of more sophisticated classification techniques, such as maximum likelihood methodologies or neural networks. In any case, all new developments should be adequately assessed by comparison with visual inspection of whole images; when possible, images from different cameras and/or different climates should be used in the assessment.

Note that with the results shown here, the amount of information that can be supplied by analysis of whole sky images is becoming quite large. Besides the classical product of fractional sky cover, and other information such as cloud brokenness, the visibility of the sun disk, and cloud thickness (all of them already covered by previous literature), now cloud type can be estimated. The combination of all this information gives an improved picture of the sky condition, which is very useful for research topics that involve cloud effects on radiative transfer in the atmosphere, such as cloud radiative forcing and its changes in time, and cloud effects on solar radiation in general and UV radiation in particular.

Acknowledgments

Dr. Calbó acknowledges the support from the Catalan Government for his research stay at University of Southern Queensland (USQ), through Grant AGAUR 2005 BE 00193. In addition, Dr. Calbó would like to thank USQ for the help and facilities provided during his stay. This research is partly funded by the Spanish Ministry of Science and Education through project NUCLIER (MEC CGL 2004-02325).

REFERENCES

  • Allmen, M. C., and Kegelmeyer W. P., 1996: The computation of cloud-base height from paired whole-sky imaging cameras. J. Atmos. Oceanic Technol., 13, 97–113.
  • Buch, K. A., Sun C-H., and Thorne L. R., 1995: Cloud classification using whole-sky imager data. Proc. Fifth Atmospheric Radiation Measurement (ARM) Science Team Meeting, San Diego, CA, U.S. Dept. of Energy, 35–39.
  • Calbó, J., Pagès D., and González J. A., 2005: Empirical studies of cloud effects on UV radiation: A review. Rev. Geophys., 43, RG2002, doi:10.1029/2004RG000155.
  • Carslaw, K. S., Harrison R. G., and Kirkby J., 2002: Cosmic rays, clouds, and climate. Science, 298, 1732–1737.
  • Chen, Z., Zen D., and Zhang Q., 1994: Sky model study using fuzzy mathematics. J. Illum. Eng. Soc., 23, 52–58.
  • Crawford, J., Shetter R. E., Lefer B., Cantrell C., Junkermann W., Madronich S., and Calvert J., 2003: Cloud impacts on UV spectral actinic flux observed during the International Photolysis Frequency Measurement and Model Intercomparison (IPMMI). J. Geophys. Res., 108, 8545, doi:10.1029/2002JD002731.
  • Duchon, C. E., and O’Malley M. S., 1998: Estimating cloud type from pyranometer observations. J. Appl. Meteor., 38, 132–141.
  • Garand, L., 1988: Automated recognition of oceanic cloud patterns. Part I: Methodology and application to cloud climatology. J. Climate, 1, 20–39.
  • Gedzelman, S. D., 1989: Cloud classification before Luke Howard. Bull. Amer. Meteor. Soc., 70, 381–395.
  • González, R. C., Woods R. E., and Eddins S. L., 2004: Digital Image Processing Using MATLAB. Prentice Hall, 609 pp.
  • Goodman, A. H., and Henderson-Sellers A., 1988: Cloud detection and analysis: A review of recent progress. Atmos. Res., 21, 203–228.
  • Houghton, J. T., Ding Y., Griggs D. J., Noguer M., Van Der Linden P. J., Dai X., Maskell K., and Johnson C. A., 2001: Climate Change 2001: The Scientific Basis. Cambridge University Press, 881 pp.
  • Johnson, R. W., Hering W. S., and Shields J. E., 1989: Automated visibility and cloud cover measurements with a solid-state imaging system. Tech. Rep., University of California, San Diego, Scripps Institution of Oceanography, Marine Physical Laboratory, SIO Ref. 89-7, GL-TR-89-0061, 128 pp.
  • Kassianov, E., Long C. N., and Christy J., 2005: Cloud-base-height estimation from paired ground-based hemispherical observations. J. Appl. Meteor., 44, 1221–1233.
  • Long, C. N., 2005: Accounting for circumsolar and horizon cloud determination errors in sky image inferral of sky cover. Proc. 15th Atmospheric Radiation Measurement (ARM) Science Team Meeting, Daytona Beach, FL, U.S. Dept. of Energy. [Available online at http://www.arm.gov/publications/proceedings/conf15/extended_gbs/long_cn2.pdf.]
  • Long, C., Sabburg J., Calbó J., and Pagès D., 2006: Retrieving cloud characteristics from ground-based daytime color all-sky images. J. Atmos. Oceanic Technol., 23, 633–652.
  • Lu, D., Huo J., and Zhang W., 2004: All-sky visible and infrared images for cloud macro characteristics observation. Proc. 14th Int. Conf. on Clouds and Precipitation, Bologna, Italy, Institute of Atmospheric Sciences and Climate, 1127–1129.
  • Martins, F. R., Souza M. P., and Pereira E. B., 2003: Comparative study of satellite and ground techniques for cloud cover determination. Adv. Space Res., 32, 2275–2280.
  • Mason, N., and Hughes P., 1998: An Introduction to Environmental Physics. Taylor and Francis, 463 pp.
  • Pagès, D., Calbó J., Long C. N., González J-A., and Badosa J., 2002: Comparison of several ground-based cloud detection techniques. Proc. XXVII General Assembly, Nice, France, European Geophys. Soc.
  • Parisi, A. V., Sabburg J., and Kimlin M. J., 2004: Scattered and Filtered Solar UV Measurements. Advances in Global Change Research Series, Vol. 17, Kluwer Academic, 195 pp.
  • Pfister, G., McKenzie R. L., Liley J. B., Thomas A., Forgan B. W., and Long C. N., 2003: Cloud coverage based on all-sky imaging and its impact on surface solar irradiance. J. Appl. Meteor., 42, 1421–1434.
  • Sabburg, J., and Wong J., 1999: Evaluation of a ground-based sky camera system for use in surface irradiance measurement. J. Atmos. Oceanic Technol., 16, 752–759.
  • Sabburg, J., and Long C. N., 2004: Improved sky imaging for studies of enhanced UV irradiance. Atmos. Chem. Phys., 4, 2543–2552.
  • Salvador, R., Calbó J., and Millán M. M., 1999: Horizontal grid size selection and its influence on mesoscale model simulations. J. Appl. Meteor., 38, 1311–1329.
  • Seiz, G., Baltsavias E. P., and Gruen A., 2002: Cloud mapping from the ground: Use of photogrammetric methods. Photogramm. Eng. Remote Sens., 68, 941–951.
  • Shields, J. E., Johnson R. W., Karr M. E., Burden A. R., and Baker J. G., 2003: Daylight visible/NIR whole sky imagers for cloud and radiance monitoring in support of UV research programs. Proc. SPIE, 5156, 155–166.
  • Singh, M., and Glennen M., 2005: Automated ground-based cloud recognition. Pattern Anal. Appl., 8, doi:10.1007/s10044-005-0007-5.
  • Souza-Echer, M. P., Pereira E. B., Bins L. S., and Andrade M. A. R., 2006: A simple method for the assessment of the cloud cover state in high-latitude regions by a ground-based digital camera. J. Atmos. Oceanic Technol., 23, 437–447.
  • Tooman, T. P., 2003: Whole sky imager retrieval guide. Atmospheric Radiation Measurement Program Tech. Rep. ARM TR-011.1, 109 pp. [Available online at http://www.arm.gov/publications/tech_reports/arm-tr-011-1.pdf.]
  • WMO, 1975: International Cloud Atlas. Vol. I. World Meteorological Organization, 155 pp.
  • WMO, 1987: International Cloud Atlas. Vol. II. World Meteorological Organization, 212 pp.

Fig. 1. Sample of images used in our development (from top to bottom: clear, class A, class G). All images are from the WSC. The logarithm of the spectral power values of the R/B image is represented in grayscale. A 5 × 5 averaging running filter has also been applied to reduce nonphysical variability in the spectral power.

Fig. 2. Example of the determination of the spectral intensity feature, which is the absolute value of the slope of the lines shown here for two cases: clear-sky image (open squares and thin line) and class A image (black squares and thick line).

Fig. 3. An example of the parallelepiped technique for classification. The range of possible values for each sky condition for the two features (SI and CC) is shown by the rectangles. Each rectangle is labeled in its upper-left-hand corner.

Table 1. Sky condition classes used in this study.

Table 2. Typical range of values (min and max) for each feature and sky condition derived from the combination of the two training sets of images (WSC1 and TSI1).

Table 3. Contingency matrices between the manual classification and the automatic classification, for the validation sets (top) TSI2 and (bottom) WSC2. Values in parentheses () represent images that are classified in two classes. Values in brackets [] represent images that are classified in three classes.

Table 4. Contingency matrix between the manual classification and the automatic classification, for the two validation sets together (TSI2 and WSC2) and grouping classes A, C, and D; and G and H. Values in parentheses () represent images that are classified in two classes.