Abstract
This paper proposes a novel Multiview CLOUD (mCLOUD) visual feature extraction mechanism for categorizing clouds in ground-based images. To characterize the different cloud types comprehensively, mCLOUD first extracts densely sampled raw visual descriptors from three views simultaneously: texture, structure, and color. Specifically, the scale-invariant feature transform (SIFT), the census transform histogram (CENTRIST), and statistical color features are extracted for these views, respectively. To obtain a more descriptive cloud representation, the raw descriptors are encoded with the Fisher vector, followed by a feature aggregation procedure. A linear support vector machine (SVM) then serves as the classifier that yields the final cloud image categorization result. Experiments on a challenging six-class cloud dataset, the Huazhong University of Science and Technology (HUST) cloud dataset, demonstrate that mCLOUD consistently outperforms state-of-the-art cloud classification approaches by large margins (at least 6.9%) under all experimental settings. The experiments also verify that, compared with a single view, the multiview cloud representation generally enhances performance.
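As an illustration of the structure view, the following is a minimal sketch of a census transform histogram of the kind CENTRIST is built on. It is not the implementation used in this work. The function names `census_transform_histogram` and `l1_normalize`, the list-of-rows image format, the skipping of border pixels, and the bit convention (a bit is 1 when the center pixel is greater than or equal to its neighbor) are all assumptions made for the example. The sketch processes a single grayscale patch; dense sampling would apply it to patches on a regular grid.

```python
def census_transform_histogram(image):
    """Return a 256-bin histogram of 8-bit census codes for a grayscale patch.

    `image` is a list of equal-length rows of numeric intensities
    (a hypothetical input format chosen for this sketch).
    Border pixels are skipped because they lack a full 3x3 neighborhood.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    # The 8 neighbors in raster order (left to right, top to bottom).
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    histogram = [0] * 256
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            center = image[y][x]
            code = 0
            for dy, dx in offsets:
                # Assumed convention: bit is 1 when center >= neighbor.
                bit = 1 if center >= image[y + dy][x + dx] else 0
                code = (code << 1) | bit
            histogram[code] += 1
    return histogram


def l1_normalize(histogram):
    """Scale the histogram so its bins sum to 1 (left unchanged if empty)."""
    total = sum(histogram)
    return [count / total for count in histogram] if total else list(histogram)


if __name__ == "__main__":
    patch = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    hist = census_transform_histogram(patch)
    # Print the census code(s) that occur in this patch.
    print([code for code, count in enumerate(hist) if count])
```

In the pipeline described above, per-patch descriptors such as this one would then be encoded with the Fisher vector and aggregated before being passed to the linear SVM; those stages are outside the scope of this sketch.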