An instance-based nearest-neighbor algorithm was developed for a Geostationary Operational Environmental Satellite (GOES) cloud classifier. Expert-labeled samples serve as the training sets for the various GOES image classification scenes. The initial implementation of the classifier using the complete set of available training samples has proven to be an inefficient method for real-time image classifications, requiring long computational run times and significant computer resources. A variety of training-set reduction methods were examined to find smaller training sets that provide quicker classifier run times with minimal reduction in classifier testing set accuracy. General differences within real-time image classifications as a result of using the various reduction methods were also analyzed. The fast condensed nearest-neighbor (FCNN) method reduced the size of the individual training sets by 68.3% (fourfold cross-validation testing average) while the average overall accuracy of the testing sets decreased by only 4.1%. Training sets resulting from these reduction methods were also applied within a real-time classifier using a one-nearest-neighbor subroutine. Using the FCNN-reduced set, the subroutine run time on a 30° latitude × 30° longitude image (GOES-10 daytime) with 11 289 600 total pixels decreased by an average of 60.7%.
Following the motivation and method described in Tag et al. (2000), a one-nearest-neighbor Geostationary Operational Environmental Satellite (GOES) cloud classifier was developed to analyze imagery in terms of specific and general cloud types, and snow cover, sun-glint, and clear-sky (no clouds, ground snow, or sun glint) conditions. The classifier algorithm can be applied to both day and night imagery from either the GOES-10 or GOES-12 satellite as opposed to the daytime-only Advanced Very High Resolution Radiometer (AVHRR) classifier described in Tag et al. (2000). Also, all visible and infrared channels (excluding the 13.3-μm channel from GOES-12) were used in the classifier development.
An instance-based classification algorithm employs a machine learning technique in which training datasets are stored in their entirety and a distance function is used to make predictions. For most purposes, operational users rely on the availability of near-real-time cloud classification data or imagery. This requirement can be difficult to meet with instance-based classifier applications that use a large training set of samples, as is the case for the one-nearest-neighbor classifier used in this study. Enhancing the classifier to optimize speed and accuracy is the primary motivation of this research.
In the recent past, various automated cloud classification algorithms have been developed for a variety of intended purposes. Such uses include comparisons of different sensors in cloud detection and classification abilities (Li et al. 2005), enhancement of rain-rate algorithms (Hong et al. 2004; Miller and Emery 1997; Uddstrom and Gray 1996; Liu et al. 1995), and classification of cloud and surface types to benefit cloud information retrievals (Li et al. 2003).
Cloud classification algorithms have been applied to data from various satellites and sensors, including the Advanced Very High Resolution Radiometer (AVHRR) on National Oceanic and Atmospheric Administration (NOAA) satellites (Pavolonis et al. 2005; Derrien and LeGleau 1999; Baum et al. 1997; Miller and Emery 1997; Uddstrom and Gray 1996), the Moderate Resolution Imaging Spectroradiometer (MODIS) on the Earth Observing System (EOS) Terra and Aqua satellites (Li et al. 2005; Lee et al. 2004; Li et al. 2003), the Imager on GOES I–M (Hong et al. 2004; Tian et al. 1999), the Multispectral Radiometer (MSR) on the European Meteorological Satellite (Meteosat) (Lewis et al. 1997), and the Visible Infrared Spin Scan Radiometer (VISSR) on the Japanese Geostationary Meteorological Satellite (GMS) (Liu et al. 1995).
Cloud classifier methodologies are varied and not limited to nearest-neighbor and other instance-based algorithms. These methodologies have included thresholding techniques (Pavolonis et al. 2005; Derrien and LeGleau 1999; Liu et al. 1995), maximum likelihood (Li et al. 2005; Li et al. 2003), unsupervised clustering (Hong et al. 2004), multicategory support vector machines (Lee et al. 2004), neural networks (Tian et al. 1999; Miller and Emery 1997; Lewis et al. 1997), fuzzy logic (Baum et al. 1997), and Bayesian discriminant functions (Uddstrom and Gray 1996). An extensive review of earlier cloud classification research can be found in Pankiewicz (1995).
A one-nearest-neighbor procedure (Cover and Hart 1967) serves as the instance-based classification methodology (Aha et al. 1991) for the classifier described here. Similar to the methods in Tag et al. (2000), eight training sets of expert-labeled samples were created to serve as the instances from which the nearest neighbor is found for an unclassified (or testing) image sample. Three weather and imagery analysis experts classified samples independently, and only those samples for which all experts agreed were used in the training sets. These eight datasets are differentiated by a unique combination of the following three characteristics of the unknown sample: 1) GOES-10 or GOES-12, 2) day or night, and 3) land or sea background. Daytime samples were defined by solar zenith angles less than 85°, and nighttime samples were defined by solar zenith angles greater than 90°. The output classes discriminated within the classifier are listed in Table 1. The lack of the higher-resolution visible channel at night necessitates that the nighttime classes be more general in nature and that the sample size be 32 × 32 pixels as opposed to the 16 × 16 pixel size used for the daytime classifier. The characteristic features or attributes described in Tag et al. (2000) were used to represent each training sample. These spectral, textural, and ancillary features were derived from the GOES visible and infrared channels or from specific data information (e.g., latitude, date). Starting from the large feature set (100+ features), a feature selection algorithm was applied to each of the eight training sets. The selection algorithm used was the beam variant of backward sequential selection (BSS) described in Bankert and Aha (1996). The selected set of features for each of the eight training sets can be found in Tables 2 and 3. Note that in addition to individual channel features being extracted, channel difference (e.g., channel 2 − channel 4) features were also computed.
Also, the sea surface temperature (SST) was taken from a National Centers for Environmental Prediction (NCEP) climatological database. For detailed information on the computation of textural features [mean difference, mean sum, contrast, angular second moment, entropy, cluster shade, and standard deviation (SD)], see Tag et al. (2000) and Welch et al. (1992).
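The way the three sample characteristics select one of the eight training sets can be illustrated with a minimal Python sketch. The function name `training_set_key` and the tuple key are hypothetical, and returning `None` for solar zenith angles between 85° and 90° is an assumption made here, because the text does not state how such samples are handled.

```python
def training_set_key(satellite, solar_zenith_deg, is_land):
    """Return a (satellite, period, background) key for one of the eight sets.

    Hypothetical helper: the thresholds (85 and 90 degrees) come from the
    text; the None result for the gap between them is an assumption.
    """
    if satellite not in ("GOES-10", "GOES-12"):
        raise ValueError("unsupported satellite: " + satellite)
    if solar_zenith_deg < 85.0:
        period = "day"
    elif solar_zenith_deg > 90.0:
        period = "night"
    else:
        return None  # neither day nor night by the stated definitions
    background = "land" if is_land else "sea"
    return (satellite, period, background)


day_land = training_set_key("GOES-10", 40.0, True)
night_sea = training_set_key("GOES-12", 120.0, False)
twilight = training_set_key("GOES-10", 87.0, True)
```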
The sizes of the eight training sets employed by the GOES cloud classifier are listed in Table 4. In an operational setting, when a real-time image is presented to the classifier, the one-nearest-neighbor algorithm computes the distance (within the appropriate feature space) between a given sample within the image and each of the training-set samples. This is repeated for all image samples and, for large images, can result in long run times and the intense use of computer resources. Mitigating these potential problems, by employing training-set reduction methods without compromising classifier accuracy, is the goal of this research. The focus here is not on developing new training-set reduction techniques but on how current techniques can be applied to a cloud classification dataset.
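The cost described above can be seen in a minimal Python sketch of a one-nearest-neighbor search. The Euclidean distance, the function names, and the toy feature values are assumptions for illustration only; they are not taken from the operational classifier.

```python
import math


def nearest_neighbor_label(sample, training_set):
    """Return the label of the training sample closest to `sample`.

    `training_set` is a list of (feature_vector, label) pairs. Euclidean
    distance is an assumption; the metric is not specified in the text.
    """
    best_label, best_dist = None, math.inf
    for features, label in training_set:
        dist = math.dist(sample, features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def classify_image(samples, training_set):
    """Label every image sample; one distance per (sample, training) pair."""
    return [nearest_neighbor_label(s, training_set) for s in samples]


# Toy two-feature training set with made-up values.
training = [((0.0, 0.0), "clear"), ((1.0, 1.0), "St"), ((4.0, 4.0), "Cb")]
labels = classify_image([(0.1, 0.2), (3.5, 3.9)], training)
```

Because every image sample is compared with every training sample, the number of distance computations is the product of the two counts, which is why shrinking the training set shortens the run time.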
2. Training-set reduction methods
The most basic learning method to apply to an instance-based algorithm is to store all training-set samples. For large training sets, this method requires a relatively large amount of computer memory and results in slow execution of the classifier. Classifier generalization may also be degraded by the inclusion of noisy samples (Wilson and Martinez 2000). The long computer run time renders a quick turnaround for near-real-time image classifications costly, if not impossible. To provide potential users with near-real-time classifications, a number of training-set reduction methods were tested. The goal of these tests was to make the training set as small as possible with minimal effect on the classifier accuracy.
Training-set reduction techniques tend to produce one of two results: they retain the central samples of each class, or they retain the training-set samples that reside near feature space decision boundaries (Wilson and Martinez 2000). Figure 1 is an example of a training set within feature space for a two-class (class X and class O), two-feature application of a one-nearest-neighbor algorithm. When the complete training set is reduced to central samples, only the red samples would be included. Conversely, only the green samples would be included when the training set is reduced to samples near decision boundaries. If testing sample A is being classified, it would be classified as an O class for the red training set and as an X class for the green training set. Samples B and C would be classified as O and X, respectively, regardless of whether the reduced training set consisted of central samples or boundary samples.
One methodology for creating a reduced training set of central feature space samples is to use the feature space centroid of each class as the training samples. This method also creates the smallest usable training set, with a size equal to the number of classes (i.e., one training sample per class). Each class centroid for this study consisted of the average value of each feature for each class.
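The centroid reduction can be sketched in a few lines of Python. The function name `class_centroids` and the toy data are hypothetical; the only rule taken from the text is that each centroid holds the per-feature average over one class.

```python
from collections import defaultdict


def class_centroids(training_set):
    """Reduce (feature_vector, label) pairs to one mean vector per class."""
    grouped = defaultdict(list)
    for features, label in training_set:
        grouped[label].append(features)
    centroids = []
    for label, rows in grouped.items():
        count = len(rows)
        # zip(*rows) walks the samples feature by feature.
        mean = tuple(sum(column) / count for column in zip(*rows))
        centroids.append((mean, label))
    return centroids


toy = [((0.0, 0.0), "X"), ((2.0, 4.0), "X"), ((10.0, 10.0), "O")]
centroids = class_centroids(toy)
```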
Another methodology for creating a reduced training set of central samples is to remove any training-set sample that has at least one feature that falls more than σ standard deviations from the class mean of that feature. The smaller σ is, the smaller the size of the cluster that represents each class. Values of σ = 1 and σ = 2 were used here.
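A minimal Python sketch of this standard deviation filter follows. The function name `sd_filter` is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, because the text does not say whether the population or sample form was used.

```python
import statistics
from collections import defaultdict


def sd_filter(training_set, sigma):
    """Keep samples whose every feature lies within sigma SDs of its class mean."""
    grouped = defaultdict(list)
    for features, label in training_set:
        grouped[label].append(features)
    kept = []
    for label, rows in grouped.items():
        columns = list(zip(*rows))
        means = [statistics.fmean(c) for c in columns]
        sds = [statistics.pstdev(c) for c in columns]  # population SD (assumption)
        for row in rows:
            if all(abs(v - m) <= sigma * s for v, m, s in zip(row, means, sds)):
                kept.append((row, label))
    return kept


# One class, one feature, with a single outlying sample (made-up values).
toy = [((1.0,), "X"), ((2.0,), "X"), ((3.0,), "X"), ((4.0,), "X"), ((20.0,), "X")]
sd1 = sd_filter(toy, 1)
sd2 = sd_filter(toy, 2)
```

In this toy case the outlying sample is dropped for σ = 1 and retained for σ = 2, which mirrors the smaller cluster expected from the smaller σ.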
Hart (1968) introduced a training-set reduction method called the condensed nearest-neighbor (CNN) rule, which produces a training set of samples close to class boundaries in feature space. This method results in a subset of training samples such that every training sample not in the subset would be classified correctly by the subset in a one-nearest-neighbor classifier. CNN begins with a random selection of one training sample in each class to be placed in the subset. If a sample from the remaining training set is misclassified using only this subset of training samples, then that sample is added to the subset immediately. This process continues until no more samples can be added to the subset. This method should result in training samples that are closest to the class boundaries in feature space. A variant of this method is known as the fast condensed nearest-neighbor (FCNN) rule (Angiulli 2005), which starts the subset selection with the class centroids instead of a random selection.
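The condensing procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study: the distance is Euclidean, the function names are hypothetical, and the `seed_with_centroids` option only imitates the FCNN starting point by seeding with the training sample nearest each class centroid. The full FCNN rule of Angiulli (2005) also changes how samples are added in each iteration, which is not reproduced here.

```python
import math
import random
from collections import defaultdict


def nn_label(sample, subset):
    """Label of the subset member nearest to `sample` (Euclidean, assumed)."""
    return min(subset, key=lambda item: math.dist(sample, item[0]))[1]


def seed_indices(training_set, seed_with_centroids, rng):
    """Pick one starting index per class: random, or nearest to the centroid."""
    by_class = defaultdict(list)
    for i, (_, label) in enumerate(training_set):
        by_class[label].append(i)
    seeds = []
    for idxs in by_class.values():
        if seed_with_centroids:
            rows = [training_set[i][0] for i in idxs]
            centroid = [sum(c) / len(rows) for c in zip(*rows)]
            seeds.append(min(idxs, key=lambda i: math.dist(training_set[i][0], centroid)))
        else:
            seeds.append(rng.choice(idxs))
    return seeds


def condense(training_set, seed_with_centroids=False, rng=None):
    """Add every sample the current subset misclassifies until none remain."""
    if rng is None:
        rng = random.Random(0)
    chosen = set(seed_indices(training_set, seed_with_centroids, rng))
    changed = True
    while changed:
        changed = False
        for i, (features, label) in enumerate(training_set):
            if i in chosen:
                continue
            subset = [training_set[j] for j in chosen]
            if nn_label(features, subset) != label:
                chosen.add(i)  # added immediately, as in the CNN description
                changed = True
    return [training_set[i] for i in sorted(chosen)]


# Toy two-class data with one sample of each class near the boundary.
training = [((0.0, 0.0), "X"), ((1.0, 0.0), "X"), ((0.0, 1.0), "X"),
            ((1.0, 1.0), "X"), ((2.4, 2.4), "X"),
            ((5.0, 5.0), "O"), ((6.0, 5.0), "O"), ((5.0, 6.0), "O"),
            ((6.0, 6.0), "O"), ((3.2, 3.2), "O")]
cnn_like = condense(training)
fcnn_like = condense(training, seed_with_centroids=True)
```

When the loop ends, every training sample left out of the subset is classified correctly by the subset, which is the consistency property stated above.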
3. Method comparison
Four different training-set reduction methods were compared with a nonreduced training set by performing a fourfold cross validation of each of the eight complete training datasets. For a fourfold cross validation, the data are randomly split into four subsets. Each subset is iteratively held out as the testing set and the remaining three subsets are combined to form the training data. Each training set was reduced by each of the training-set reduction methods, creating a total of five training sets (four reduced sets and one nonreduced set) for each of the eight original datasets. The four reduced training sets are 1) centroids only, 2) samples with all features within one standard deviation of class/feature mean (noted as SD1 hereinafter), 3) samples with all features within two standard deviations of class/feature mean (noted as SD2 hereinafter), and 4) samples selected using the FCNN rule. To clarify this description, Fig. 2 is a diagram example for the data distribution of a given training dataset. A one-nearest-neighbor algorithm was performed on each testing set sample using the reduced training set as the search space for a nearest neighbor.
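The testing procedure can be sketched in Python as shown here. The names `fourfold_splits`, `accuracy`, and `cross_validated_accuracy`, the Euclidean distance, the fixed random seed, and the toy data are all assumptions for illustration. The `reduce` argument stands in for any of the reduction methods and defaults to the nonreduced case.

```python
import math
import random


def fourfold_splits(data, k=4, seed=0):
    """Yield (train, test) pairs; each of the k random subsets is held out once."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = [data[i] for i in folds[held_out]]
        train = [data[i] for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


def accuracy(train, test):
    """Fraction of test samples whose nearest training sample shares their label."""
    correct = 0
    for features, label in test:
        predicted = min(train, key=lambda item: math.dist(features, item[0]))[1]
        correct += predicted == label
    return correct / len(test)


def cross_validated_accuracy(data, reduce=lambda train: train):
    """Average testing accuracy over the folds after reducing each training set."""
    scores = [accuracy(reduce(train), test) for train, test in fourfold_splits(data)]
    return sum(scores) / len(scores)


# Two well-separated toy classes with eight samples each (made-up values).
toy = [((0.1 * i, 0.0), "X") for i in range(8)] + [((10.0 + 0.1 * i, 0.0), "O") for i in range(8)]
full_set_accuracy = cross_validated_accuracy(toy)
```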
The overall accuracy (average of the fourfold cross validation) for each testing set when the nonreduced training set was used is listed in Table 5. The classification accuracy at night was predictably lower in all cases because of the lack of visible channel data, but all accuracies were greater than 77%. For individual classes (not shown here), cirrocumulus (Cc) clouds were the most misclassified (by percentage) for daytime classifications, while mixed and high, thin cloud classes were consistently the most misclassified for nighttime classifications.
In comparison with the nonreduced training set, the average (of the fourfold cross validation) percent training-set size reduction, the average decrease in accuracy for the testing set, and the average percent decrease in run times (using near-real-time images) for each reduced training set are computed and presented in Table 6. The size and accuracy reductions had fairly small variability among the eight datasets for any given training-set reduction method. Typical run times when using the complete training sets ranged from 8 to 12 min on a single processor machine for the image size used here (30° latitude × 30° longitude). Longer run times were observed for the classification of larger areas. The differences in run-time reduction between GOES-10 versus GOES-12 and day versus night need further investigation. The difference in the total number of training samples (Table 4), with the complete GOES-10 datasets being much larger than the GOES-12 datasets, and the sample size differences in day versus night within real-time image classifications may be contributing factors.
For three of the methods (excluding SD2), the size of the training set was reduced by at least 61%. Two questions follow: 1) Do these large training-set reductions come at a large cost in accuracy for the one-nearest-neighbor algorithm, and 2) do they reduce the run time significantly? As expected, the greater the size reduction, the greater the run-time reduction. Non-SD2 methods had at least a 24% average run-time reduction over all of the datasets, with all three methods providing more than a 60% average run-time reduction for the GOES-10 day real-time classifications. While the SD1 and the FCNN methods have similar run-time reductions, the FCNN method performed better in terms of minimizing the accuracy losses over the testing sets with a maximum decrease of 5.5% overall. The centroid method provides the greatest benefit in terms of computer run times with an average of 50.4% reduction over all datasets, but at a much greater cost in accuracy (29.4% average decrease). The SD2 method accuracies were higher than the centroid and SD1 methods for all datasets. However, despite the much larger training-set size, this method did not provide significant accuracy improvement when compared with the FCNN method. For half of the eight datasets, the FCNN method produces a smaller accuracy reduction (i.e., higher accuracy) than the SD2 method, and it does so with a much quicker run time. A graph of the size, accuracy, and run-time reduction averages over all eight datasets is displayed in Fig. 3. The SD2 method provided the smallest decrease (3.8%) in accuracy, but the FCNN method had a similar decrease (4.1%) and resulted in a greater reduction in computer run time (39.4% versus 14.8%).
While a direct linear relationship between training size reduction and decrease in computer run time appears to be evident in Fig. 3, examining the individual datasets (Table 6) provides a different observation. For example, GOES-10 day centroid datasets were created from a 99.6% (land) and 99.7% (sea) reduction from the total training dataset. The run-time reduction for GOES-10 day (land-and-sea-combined image) was 71.9%. GOES-12 day centroid datasets were similar in size reduction to that of the GOES-10 day dataset, but run-time reduction was only 45.9%. Among the reasons for these differences could be dataset size and/or feature space size variability.
Although an in-depth interpretation is beyond the scope of this paper, some tendencies appeared in the individual class accuracies throughout the various testing set results. For the daytime land datasets and all four nighttime datasets, the convective classes [day: cumulonimbus (Cb), night: deep convection (DC)] have higher accuracies when using the SD2 training set as opposed to using the nonreduced training set. One possible explanation is that these classes are unique within their respective feature spaces and the SD2 training-set cluster becomes very isolated. This type of feature space distribution would allow any Cb or DC testing sample, whether it falls within, near, or far outside the cluster, to be classified correctly. When the complete dataset was used, training samples closer to class decision boundaries for all classes are included, thus allowing more Cb or DC testing samples to be misclassified. This explanation was reinforced by the fact that for all eight datasets, Cb (day) or DC (night) testing samples had higher accuracies using the SD2 training sets than when using the FCNN training sets. The samples in the FCNN training sets were closer to class boundaries than those within a cluster near the centroid. Conversely, stratus-type clouds [day: stratus (St) and altostratus (As), night: high thick (mainly cirrostratus)] had higher accuracies when the FCNN training sets were used as opposed to the SD2 training sets. The chosen feature sets may limit the distance between the SD2 clusters for these classes, requiring the inclusion of boundary or outlier samples to maintain the class accuracies in the testing set results.
The statistics generated from the various training–testing set combinations provide a measure of the potential performance of a given algorithm. Examining images generated with the reduced training sets is also informative. As an example, a GOES-10 daytime (with both land and sea in the image) dataset was classified five times, once with each of the training sets: the complete set and the four reduced sets produced by the methods described here. Visible and infrared as well as classification images for this example are displayed in Figs. 4–10. The differences among the classification images are visually distinguishable and provide an illustration of how the classification can be affected by changes in the training set. To obtain an indication of the quantitative differences when examining real-time examples, a set of 32 images was classified using all of the appropriate reduced training sets along with a classification using the complete training set. Using the complete training-set classification image as the benchmark, the average percent pixel differences are listed in Table 7. As seen in the testing data, SD2 and FCNN provided the closest approximation to the benchmark. Taking into account the significant differences in run time, the FCNN method would appear to be the preferred method for training-set reduction. While all 32 images were from the same time of year (late autumn/early winter), similar results should occur throughout the year because the same training data and classes are applied through all seasons.
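The benchmark comparison can be expressed as a short Python sketch. The function name `percent_pixel_difference`, the flat list representation of a classification image, and the toy labels are assumptions for illustration; the values in Table 7 were not produced by this code.

```python
def percent_pixel_difference(benchmark, candidate):
    """Percent of pixels whose class differs from the benchmark image.

    Both images are flat sequences of class labels of equal length.
    """
    if len(benchmark) != len(candidate):
        raise ValueError("images must contain the same number of pixels")
    differing = sum(b != c for b, c in zip(benchmark, candidate))
    return 100.0 * differing / len(benchmark)


def average_percent_difference(image_pairs):
    """Mean of the per-image percent differences over (benchmark, candidate) pairs."""
    values = [percent_pixel_difference(b, c) for b, c in image_pairs]
    return sum(values) / len(values)


complete = ["Cb", "St", "St", "clear"]   # classified with the complete set
reduced = ["Cb", "St", "As", "clear"]    # classified with a reduced set
difference = percent_pixel_difference(complete, reduced)
```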
For a successful instance-based cloud classifier, collecting as much training-set data as is needed to represent the universe of possibilities for the given classes would appear to be a natural goal. However, reaching this goal may not always be desirable. Not only could a large training dataset introduce noise that can degrade classifier performance, but the resulting excessive computer run times may also make near-real-time analysis cumbersome, at a minimum, and nearly impossible in certain situations. The motivation behind this research was to find methods to reduce the size of training sets and the associated computation run time with minimal accuracy impact.
While there appears to be some individual class preference to using clustered central training samples (i.e., SD2 method) rather than class boundary training samples (i.e., FCNN method), the overall accuracy differences were minimal for nearly all of the eight datasets. For this type of imagery analysis, employing the FCNN reduced training sets provided greater overall benefits with minimal accuracy reductions, significantly reduced computer run times, and faster near-real-time classifier output. It is worth noting that the FCNN method is not necessarily going to be the best method for every one-nearest-neighbor problem. Data type, data amount, and features used would also have an impact. In addition, a different reduced training-set method may work best if a different instance-based classification algorithm (other than nearest neighbor) were used on the same dataset.
The choice of the training-set reduction method, as well as the choice of the classification method, is guided by user needs. Additional supervised machine learning algorithms and/or training-set reduction methods could be examined in future research.
The support of the sponsor, the Federal Aviation Administration (FAA) through the Aviation Weather Research Program (AWRP), is gratefully acknowledged. The views expressed are those of the authors and do not necessarily represent the official policy or position of the FAA. Past support for cloud classification research from the Office of Naval Research and Oceanographer of the Navy is greatly appreciated. Thanks are extended to those past and present members of the Satellite Meteorological Applications section of the Naval Research Laboratory Marine Meteorology Division for their advice and support.
Corresponding author address: Richard Bankert, Naval Research Laboratory, 7 Grace Hopper Ave., Monterey, CA 93943-5502. Email: email@example.com